About on rnewman

Camellia oil: the ideal shave

Sat, 05 Aug 2023 00:00:00 +0000

Like many, I have a drawer half-full of part-used shaving soaps, brushes, and so on. Over the last year I began pursuing simpler solutions that would travel well, and had settled on the smallest soap sticks I could find: Speick and La Toja with a neat Mühle travel brush. I began using the sticks at home, too. It’s less hassle to lather on my face than in a bowl.

A few months ago I was planning a five-day backpacking trip, and realized that I often end up with an itchy neck by day four. I needed an option that would be suitable for lightweight or ultralight backpacking: I didn’t want to carry soap and a brush, and ideally wouldn’t need to thoroughly heat water, lather, and clean up just to shave.

Shave oil was the obvious option. But every brand I looked at was the same: packed to the gills with scents and botanicals, heavy on the menthol, and as much as $20 for a tiny bottle. I tried Somerset’s “Extra Sensitive Shaving Oil” (£11.50 for 35ml) and I have to wonder what the regular version is like — the “extra sensitive” was heavily scented and made my skin vibrate with menthol.

I tried shaving with some common cosmetic ingredients (e.g., squalane) and did not get a good shave. I didn’t want to use a typical food oil that would go rancid or attract bears in the backcountry, so I didn’t try olive or nut oil.

Eventually it occurred to me that I had a big bottle of stable, safe, non-food oil sitting on my shelf. I gave it a try, and enjoyed a bafflingly good shave: smooth, unscented, nick-free. The only downside is one shared with all shaving oils, a tendency to leave clumps of oil and stubble in the sink.

The oil I settled on is cold-pressed camellia oil (tsubaki abura), the oil of camellia japonica. A 100ml bottle is $10. I have been using it for years to oil carbon steel kitchen knives, which is its traditional use, along with machine lubrication (e.g., sewing machines). It’s tasteless, odorless, food-safe, and non-drying (all of which are good for kitchen knives!). Apparently it’s also non-comedogenic and used in skincare.

Six drops is more than sufficient to shave my entire face. I put some in a little dropper bottle for travel, and I’ve also been using it at home, because it’s fast, convenient, and gives just as good a shave as careful prep and luxurious lather.

For carry-on travel I pair the little dropper bottle of oil with a Bic Metal disposable razor, which is the simplest and best TSA-safe razor I’ve found. When I’m checking a bag I take my RazoRock Game Changer 0.84-P with a little sliding metal tin of Gillette Nacet blades.

As pictured, the Bic Metal weighs 5g, and the half-full dropper bottle is 9.4g. You could probably get that lower by using a different dropper bottle.

A total outlay of $10 in oil might be enough for me to shave at home and on the road for my entire life. Shaves at home with the Nacet blades cost me about 7¢. Shaves on the road with the Bic Metal cost about 25¢.

For backpacking and travel the oil also serves as an impromptu lubricant for squeaky gear that folds or turns, and as a skin emollient, but it’s small and light enough that it doesn’t need to be multipurpose.

My goal with a lot of gear is to get to a point of simply not having to think about it: to have found a solution that’s robust, reliable, and works so well that I stop looking for anything better. I have reached that point with this kit.

Tenets

Fri, 21 Oct 2022 00:00:00 +0000

I learned a lot during my time at AWS. One thing that I haven’t seen discussed much is how documented tenets can be used to structure a team’s path.

Tenets in this context are documented principles and beliefs that a team holds about itself, its users, and its products. They can be a useful guide for decision-making, and even the process of defining and phrasing the team’s tenets — and the ruthless prioritization needed to hit the expected limit of 5–7 — brings tremendous clarity. As with planning, the value is largely in the process, not the outcome!

Tenets can be set at the team level, the service level, or for smaller efforts and scopes that would benefit from directional clarity.

Writing good tenets is hard, and even tenured leaders often misunderstand the concept, losing some of the benefits.

A bad tenets list reads like a statement of values, a wishlist, simple priorities, or untethered aspirations. Our product will be fast, cheap, and available everywhere.

This blog post publishes some internal Amazon tenets (see the w.amazon.com links!).

A good example is from IoT:

Necessary offline operations: We help customers build systems in the cloud that work in predictable ways when connectivity is limited.

The point of writing up tenets is not to capture something that’s unambiguous or uncontested (products should be fast, the team should have a high bar for quality), nor to capture something that’s easily defined in some process doc or as a metric and thus does not require much human judgment. It’s to define a mental model for coping with conflicting priorities, to define a position in an exclusive value space, or to provide a mast to which a team under fire can tie themselves.

I’m sure it would be very convenient to tie some IoT capabilities to a network connection. Making a system work reliably with limited connectivity is hard and must be designed in from the start; it’s something that customers need help with; and it’s a requirement that engineering teams would love to drop! This tenet encodes an intention that steers decisions towards the inconvenient choice that’s right for customers.

Tenets like this are a statement in advance, while we still have the space and presence of mind to use good judgment, that we won’t do the expedient thing. It is a form of precommitment — the corporate equivalent of setting out your workout clothes the night before, because you know you won’t feel like exercising in the morning.

Accessibility is another good example. It’s important, it has top-down guidance in many product orgs, and yet teams will routinely sacrifice accessibility (and localization!) in order to hit a deadline, saying “yeah, well, accessibility is important, but of course we have to weigh it against shipping features…”.

A weak tenet like accessibility is important is strengthened by making it definite — our releases are always accessible to users of assistive technologies — or make the tradeoffs explicit: we prioritize accessibility over time to market, and will delay a release that doesn’t serve all of our customers.

A good tenet can capture discomfort and interaction patterns, too: we avoid excluding our remote teammates by making decisions asynchronously and with time for thoughtful comments, rather than in synchronous meetings. Again, there’s a tradeoff captured: that the team will go slower in order to be inclusive and make more measured decisions.

Other tenets can express an opinionated stance about product-market fit: our users are sophisticated and prefer detailed, accurate docs rather than vague marketing materials, or we accommodate our industry’s long sales cycles by treating trial users with the same priority as sold users.

At a leadership offsite my team tried an exercise: gather in small groups and write what you think the whole org’s tenets should be, then come together and discuss. The outcome was educational, if not surprising: many newer leaders wrote bad tenets; many wrote tenets that focused on the importance of their teams’ efforts, rather than the breadth of the business and our coordinated long-term success; and few wrote tenets that surfaced the essential tensions between their own priorities and those of other managers. This exercise is time well spent if you can spare an hour, but be prepared to work through the disagreements about what really matters for your teams!

Writing for an audience

Fri, 21 Oct 2022 00:00:00 +0000

Many Stripes are in the habit of writing “5:15s”: short updates that take fifteen minutes to write and five to read. My own staff notes are infrequent, on the order of every month or two.

In part that is because I find it hard to find the time to digest my week into a useful narrative. If you told 25-year-old me how many hours of meetings I would have each week, and how much time I spend on Slack, he wouldn’t have believed you, and would have gone back to writing code. (First he would have asked you what Slack was!) The raw dump would be overwhelming and unnavigable.

In part it’s because I spend a lot of time constructing custom-fit narratives for individuals or small groups, trying to serve their needs by shaping my experiences and perspectives into a parable and an account that succinctly gives them what they need for the situation in which they find themselves. A mentee recently apologized for presenting so many questions and problems for me to respond to, and I laughed — this is the most fun part of my job!

Mostly, though, it’s because I am acutely aware that a broadcast narrative — whether that’s an email, a project review update, a 5:15, or a conference talk — is a tool for changing an audience’s perspective, and getting that right requires both understanding the audience and providing (or assuming) enough context that the message can be practically brief. Often the context cannot feasibly be communicated, either because it is too early and unstable, or because it is so impractically large that the audience will extract no net value from what I have to say.

That leaves me where I am today: with occasional abstract updates.

Those of you who succeed at broader notes: I would love to hear your perspectives and ideas. Those who read these abstract notes: I am curious whether you find them valuable, or would prefer something more concrete (or different entirely).

First month at Stripe

Sun, 06 Feb 2022 00:00:00 +0000

I’m approaching the end of my first month at Stripe. I didn’t announce where I was landing in my earlier post out of a sense of counting my chickens, so I’m glad to change that!

It is a strange and amusing shift to go from working at a bookshop with a climate pledge to working at a publisher with a carbon sequestration arm.

After more than three years at AWS I was quite settled on my team, surrounded by people I knew well, and aware of the patterns swirling at various levels. I was also aware of my pace of learning having slowed, and I was tired of managing my cognitive dissonance. While I miss the people, the Principal Engineering community, and some unique aspects of my role, and I’m grateful for how much I had the opportunity to learn, I feel good about my decision to leave.

I also feel good so far about my decision to join Stripe: even as a relatively large startup, there is still a coherence of vision and some agility, and it really feels like we are all pulling in roughly the same direction. The people are kind, and that’s even expressed in the benefits decisions that management makes, like explicitly giving emergency medical and caring leave this year — too many of my friends at Amazon burned through PTO due to surprise COVID or caring responsibilities. No place is perfect, but so far I can live with the imperfections.

My first weeks at Stripe have been a firehose, as one might imagine: simultaneously learning a new domain (finance and payments), how Stripe’s businesses fit in to that, the technologies we use, the work that’s beginning, and the people. Stripe has a very thorough “101” onboarding program, but I quite quickly (too quickly?) ramped up on some real work, and I feel like I accidentally struck a good balance: after 19 working days I already have a web of relationships with my onboarding cohort, my new teams, and other staff engineers in my area, and I’m starting to develop the intuitions and questioning/routing behaviors that staff engineers use to do our jobs.

More importantly, I have managed to strike a good balance of time. Meetings happen 9–3 Pacific to avoid impacting east coasters, and Wednesdays are meeting-free for makers. I stop at 4 or 5, depending on when I started my day. Working at Amazon was always described to me as “intense”. So far Stripe is also intense, but bounded and mindfully so: everyone has given me advice to make sure that I don’t take on too much.

We have ambitious goals for the next twelve months, so it’s a relief to find that people view that as a long hike, not as a deathmarch. Onwards!

Tip: Synergy lag/jerking on macOS

Mon, 10 Jan 2022 00:00:00 +0000

I recently switched from using Across back to using Synergy for keyboard/mouse sharing.

I noticed substantial jerkiness and latency when my mouse was on the remote machine — pauses of a few hundred milliseconds every couple of seconds.

The culprit was AirDrop. Turning off AirDrop and closing AirDrop Finder windows allows silky smooth mousing.

I'm back; things changed

Thu, 06 Jan 2022 00:00:00 +0000

As I wrote last year, I didn’t feel like I had the space to write for public consumption while performing my role at AWS, and I would be back when things changed.

They have: I decided to leave AWS at the start of 2022.

As is tradition I then spent a couple of days fiddling with blogging systems, migrating my posts out of Medium and recovering a couple of posts from an even older blog. This is the result.

I don’t expect a particular cadence of writing, and I’m trying to not let the perfect be the enemy of good, but I’m optimistic that I will publish things now that I have an easy place to do so.

More next week.

A long pause

Sat, 25 Sep 2021 00:00:00 +0000

Working at AWS is very different to working at smaller mission-driven companies like Mozilla.

Even in projects where I get to think or work in the open, there is policy around communication that has a chilling effect on writing, as well as cultural pressure to use my time to write more leveraged blog posts to serve the interests of the team. That means I don’t write here.

This culture is in stark contrast to my pieces on Medium from 2018, my last year at Mozilla, in which I wrote to explain — and wrote to think through! — two problems with which I have grappled for long stretches of my career: how different groups of people (individuals, teams, or even communities) can reuse information that overlaps in space and vocabulary; and how different devices can change shared data over time and space without the need for heavyweight central coordination.

The AWS blog certainly attracts more readers than this little Medium, but it doesn’t serve those purposes of explaining and thinking through problems.

In occasional bursts of free time I return to the topic of structured storage, whether by writing index chunk join code to understand an algorithm, or more recently by digging into an identity challenge that is intrinsically decentralized, only to find JSON-LD (RDF in disguise!) hiding under the covers. Writing prose for public consumption, however, has fallen by the wayside.

I’ll be back when things change.

Thinking about Syncing, Part 4: keeping track

Thu, 31 May 2018 00:00:00 +0000

In Part 3 we argued that the concerns of application code differ from those of synchronization code. In this part we will take a moment to explore the latter in more depth.

The needs of synchronization

In Part 1 we discussed at a high level what a sync system has to do: move information around between its clients to allow them to reach agreement on the state of the world.

More concretely, syncing storage systems need to consider equivalence, management of identifiers, management of change, and detection of conflict.

Management of identifiers

Most systems need some way to identify things — entities in the domain of discourse. These might be numeric identifiers, globally-unique random identifiers (UUIDs/GUIDs), or salted hashes of some key. Sometimes a system will use one kind of identifier internally and another to sync. The allocation, rewriting, or in-out mapping of these identifiers is a concern of sync systems.

Equivalence

When multiple timelines can exist, it’s possible for the same conceptual entity to be identified by more than one name. Only systems in which every entity is given a stable identifier predictably derived from its unique attributes (such as a URL), or systems in which identity allocation is a centralized function, can avoid this.

When more than one identifier exists for the same entity, they can be ‘smushed’ — one is replaced with the other — or mappings can be built.

Management of change

‘Change’ is a blanket term. For our purposes, it consists of:

Addition: the introduction of a new fact into the system by one or more of its clients.
Update: the assertion by a client that an old fact has been replaced by a new fact.
Retraction: the assertion by a client that an old fact is no longer true.
Deletion: the forced un-stating of a fact so that it’s no longer part of any state, including any historical record.
Expiration: the pruning of old state, either on the individual client or system-wide, due to irrelevance. This is typically done to reduce storage footprint and improve speed. Expiration can result in clients diverging or introducing duplicate identifiers.
The expansion or alteration of the ontology or schema of the system itself.
A change in the set of syncing devices or the expected behavior of the system (e.g., turning a feature on or off).

Not only does the system need ways to model these things, but it also needs to keep track of ordering and progress: the same change shouldn’t be applied more than once, no matter the topology of the system.

Detection of conflict

When two changes conflict, the conflict should be detected and some resolution should be reached without data loss.

Beyond these concerns, we can list some requirements that we would like a real-world, practical sync system to meet.

Consistency (in the database sense)

The state that the system stores and syncs should remain consistent with some expected properties — e.g., required fields should be present and of the correct types. A change that breaks consistency must fail.

Eventual consistency (in the distributed system sense)

All clients in the system should converge on the same end state.

Quiescence

When no client is introducing new changes, no significant activity should occur: we expect the system to rapidly stop exchanging data once conflicts have been resolved.

Atomicity

It should not be possible for other clients to see only part of another client’s changes, because it makes maintaining correctness more difficult.

Incrementality

Adding new facts to existing entities (e.g., adding creation dates to all of your bookmarks) should not require disproportionate work to be done by clients.

Adding new entities should not require disproportionate work, even if those entities are related to other entities. (E.g., adding a new history visit should not require re-uploading the title and URL and previous visits of the history item.)

Adding new clients to the set should not require disproportionate work from other clients.

Continuation of service

Ordinary changes — data additions, schema extensions, client additions, etc. — should be routine and low-impact. We don’t want to lock out clients, force upgrades, or lose data as a result of a minor change; doing so harms the user experience and adds friction to engineering. When things are working they should continue to work.

A modest proposal

Applications must design for syncability when considering identifiers. We cannot avoid thinking about uniqueness, what constitutes an identifier, when they can change, how we refer to entities (which is, after all, the point of having an identifier!), and when two entities should be considered the same… and what we should do when that occurs. INTEGER PRIMARY KEY AUTOINCREMENT will end in tears.
If the application must support disconnected operation and the other constraints discussed earlier, then it should use a log-structured store to allow for automatic conflict-detecting sync. Application code should be prepared to resolve detected conflicts.
If the application’s data is relational — and choosing to model it in a non-relational tool doesn’t change this fact! — then we must think about how identification and constraints work in our domain. If a non-relational store is used, a relational layer will accrete on top, and abstractions leak. (Sometimes they are leaky by design.)
Use events to record data if you can. Derive more narrow representations from the event-shaped, log-structured data. But even event-shaped data has a schema, and storage should make that easy to evolve.
Use event-shaped APIs to separate storage representation and concerns from the application layer. setBookmarked(url, title, true) is worse than addBookmark(url, title, timestamp) . didClickStar(page, context) might be the right abstraction to aim for.
A data model that can correctly represent the domain, support syncing, and extend to future needs, will often not be a good fit for the querying requirements of the rest of the application. Resolve this tension by maintaining two (or more) representations, not by expecting a single representation to meet all needs.

Thinking about Syncing, Part 3: separation of concerns

Wed, 30 May 2018 00:00:00 +0000

In Part 1 we framed synchronization as exchanging information to allow clients to converge on a shared understanding of the world, specifically involving the merging of timelines.

In Part 2 we discussed several ways applications might do this: through snapshots (like Firefox Sync), change operations (like Google Docs), or revisions (like CouchDB).

We outlined a number of limitations of snapshot- and transformation-oriented approaches, and discovered that applications with certain requirements — offline operation, client-side crypto, etc. — might be best served by an approach that builds a concrete shared timeline between clients.

In this post we explore how the needs of a typical client application, and the data model that supports those needs, differ from the needs and corresponding data model of synchronization code. The UI usually wants to quickly examine the current state of a small slice of the world, while the synchronizer wants to reliably manage change over time. We will look at a way in which these two sets of concerns can be separated.

We will expand on the example given in Part 2, draw an analogy to DVCSes, and see some further examples of systems that separate the concerns of data management from the concerns of data consumers.

Events and tables

In Part 2 we saw a brief example of an eminently syncable data representation: the independent event, a stand-alone historical fact.

{"title": "Bohemian Rhapsody", "played": "2017–10–02T15:48:44Z"}

This is easy to sync because it can just be copied around: it doesn’t refer to anything (no identifiers to manage, and no identifiers introduced), it doesn’t reflect a change to existing data, and it can’t conflict with anything.

But this isn’t everything we need for a typical app. Let’s take a step back and think about features, and how those relate to storage.

There’s implied and missing context in that event.

We want to be able to record that this user played a particular song — not just one with that title! — by that artist, in a playlist, on a device, and so we run into issues of identity, uniqueness, and reference to other entities. Modeling these things is non-trivial: like most application data it’s relational, even if it’s stored in a document database, and that means we need careful management of identifiers and consistency.

We want to record facts that change over time — star ratings and playlist memberships, renaming playlists, and so on. That means having some conception of updates and breaking of relations.

We need to handle deletions and creations (e.g., recreating a playlist with the same name) with care. And we need a way to permanently delete data; some guilty pleasures need to be forgotten.

These things need to sync, so the changes you make on your phone are reflected on your laptop.

A log-structured model is a good fit for this: we can record additions and changes as they happen, and merge our changes in when we sync. We can record new kinds of data easily.

But we’re not done yet: the front-end code has its own requirements.

We want to slice and dice this data to support the UI on all of a user’s devices. We want them to be able to quickly find the last ten songs they played, their top 20 most played songs, their top rated songs. They need to be able to browse their current playlists, search by artist or date, and sort the results.

It’s straightforward to see how our syncing requirements map to an event log, but these front-end retrieval tasks are expensive with a pure event model: play count is a sum aggregate, star rating is last-write-wins, last played is a max aggregate.

Conversely, it’s easy to see how these front-end features map efficiently to a conventional tabular or object-based storage system, but it’s hard to implement a change-based or log-based syncing system on top of a table for playlists, a table for songs, etc. with SQL UPDATE queries. With in-place updates we must manually manage timestamps and change counters and tombstones.

We’d like to be able to build new front-end features without affecting, or even fully understanding, how the supporting data is synchronized. And we’d like to be able to reuse or change our synchronization code, or extend the data that’s synced, without having to worry about the existing complex query needs of the UI. This is classic separation of concerns.

Our sync-related requirements are in tension with our ‘direct’ application requirements. Fortunately, our little music app isn’t the first to have to resolve these tensions.

An abridged history of version control

Most of us — developers, writers, musicians, and more — start out using unsophisticated tools to manage change: copying or zipping directories if we need to preserve an older version of some work.

If a developer need to move some changes between those different directories, they manually copy files. A sophisticated user might point diff at the relevant files to produce a patch, edit it, and apply that patch with patch.

If you’ve ever had a file on your desktop called something like Essay (v1) final draft FINAL(2) EDITED (TO PRINT).pdf, then you’ve used this method.

We might jokingly call this snapshot-oriented version control.

Early version control systems improved on this somewhat. RCS tracked per-file versioning metadata in an adjacent ,v file in the filesystem. It turns out that the needs of file-oriented tools that use versioned data — source code, documentation, etc. — are very different from the needs of version control tools themselves, particularly at scale.

Build tools and IDEs want fast hierarchical access to the current state of your source code, but version control tools want to do things like quickly list every user who changed files in a directory, across moves and renames, in the last fifteen years. Your linter wants to find files missing a copyright header, but your coworker wants you to send her that debug commit that you never landed.

CVS and later VCSes split files and changes, beginning to record log files, to store deltas, and to support atomic operations over multiple files.

Modern version control systems like Git completely divorce the working tree — the files on disk — from the internal data model of the version control system itself. A Git repository can exist with no working tree at all, or with multiple working trees. The working tree is simply a checkout, a convenient instantiation of a particular state in the repository.

When a Git client updates one repository from another, it does so by replicating objects — internal representations of new files, trees, and commits — then advancing refs, then optionally rebasing local changes on top of the new remote changes, and finally optionally updating a working tree to match the new head. Git doesn’t send operations, it efficiently packs changes, and the remote repo doesn’t need to know anything about the local working tree.

We can even produce different kinds of checkouts from a single Git repository. It’s easy to grab individual files at any point in history using git show, and check out only part of a tree with a sparse checkout. And users can extract different kinds of non-file data from these tools, too.

The concerns of the data management tool, and the concerns of the consumers of the data it manages, are very different. Modern DVCSes resolve this tension by using two different data representations, deriving each from the other when necessary. The internal data representation is the one that manages change, consisting of atomic commits arranged into branches. The secondary, derived representation — a tree of files — is the one typically consumed by user-facing applications.

Seeing similarities

Once we see these tensions, and the solution of not sharing a representation between consumers, we can see the same separation in other places — email clients, photo libraries, even the MVVM pattern.

Some good examples are found in the database industry itself.

A common configuration of SQLite writes changed rows to a Write-Ahead Log (WAL), rather than directly updating the main database pages. This allows SQLite to get writes on disk cheaply and support concurrent readers and writers. The main database file is updated by ‘replaying’ the WAL. PostgreSQL supports log-shipping replication on top of its WAL; the main database format can be specialized to meet the needs of queries, rather than having to accommodate replication metadata.

As with DVCSes, these internal models of change can also offer value to application code — git log is useful! CouchDB and PouchDB expose change feeds — the same data they use for reliable syncing — to application code. Datomic is structured around a transaction log that is the source of its current indexed state, encouraging applications to take advantage of long-lived persistent data.

DVCSes have tensions between log-centric and file-centric consumers, and they resolve them by deriving the working tree from the repository.

SQL databases have tensions between readers, writers, and replication, and they resolve them by (in very simple terms) deriving database tables and indices from a written log that is also available to replicate.

A document store like CouchDB has to act like a simple object store while also managing multi-master replication and conflict, and it does so by storing a tree of document revisions.

We can see that a similar separation of concerns can apply to client-side application storage. Applications should structure their writes as a log, and derive tabular or object-oriented representations from it.

Generalizing the argument: CQRS

Synchronization is not the only feature in tension: the needs of different parts of the application can be at odds with each other. They benefit from having different representations, too.

In a browser we might naturally store bookmarks as a tree in memory, or in a flat file, so they can be shown in folders. We also need fast textual search over the titles, for which we would use a full-text index. We want fast lookup by URL to check whether the current page is bookmarked, so we want some kind of indexed lookup there, perhaps a Bloom filter. We want bookmarks to share icons, so we need some way to store and identify those. We want to find the last five bookmarks the user created, so we need some kind of timestamp index. And of course we want to write a new bookmark quickly without updating all of those structures! Features grow over time, and data stretches to try to keep up.

This isn’t news; certainly not for anyone who works on big sites.

CQRS asserts that ‘command’ (writes) and ‘query’ (reads) are best handled with different data representations, even a different representation for each reader. Event Sourcing declares that what products want to do with data is going to change over time, and so we should record data as generally as we can — as events — in order to adapt. Even ad hoc log-oriented enterprises, those that haven’t had a consultant fill whiteboards with sagas and schemas, take for granted that different tools will build varied representations from the same log.

Client apps often do keep multiple representations of data, but in an ad hoc way: write-back in-memory caches, DB indices, and queries that update two tables. By structuring the application around a rich log, it becomes relatively straightforward to derive multiple specialized representations, and add to and change those representations over time.

This is not a new observation in the wider industry.

Leading analytics tools like Amplitude, and Mozilla’s own data platform, are designed as aggregators of immutable logs, constructing derived data sources to support various query systems, including SQL-based Redash queries.

If you talked to a data analysis engineer, and told them that you were going to drop raw event data on the floor as soon as today’s derived dashboards were compiled, rather than warehousing them to answer a different set of questions next week, they’d be horrified. Yet this is what we do when a client app turns a user action, like clicking a toolbar icon, into a decontextualized INSERT INTO bookmarks (url, title, date) …: we cut down a rich arrangement of data, data that is of interest to other features, into a single simple representation that’s specialized for a particular use. We can do better.

Conclusion

It’s better to sync logs than state or changes, particularly when data is encrypted or devices go offline.
Log-structured data is well understood outside of client apps: it’s the underpinning of DVCSes, databases, and some web apps, as well as being a core part of analytics systems like Amplitude.
Different parts of larger client apps have differing query needs, and those needs are in tension, too.
Understanding an app’s data as a log of changes, transformed into varied representations for specific uses, not only brings clarity to syncing, but also makes it easier to target those differing needs.

Still to come in this series: more details on merging; more concrete exploration of how an application might be structured around a log, from modeling the domain through to defining views; and discussion of the differences between event-structured and log-structured data.

Thinking about Syncing, Part 2: timelines and change

Wed, 01 Nov 2017 00:00:00 +0000

In the first part of this series we established some definitions, concluding with a framing of synchronization as the merging of two timelines.

In this part we will examine some broad approaches to that problem.

The state of things

Some applications aren’t initially designed to sync. They’re designed to support a particular user experience, and their data storage reflects that experience. Sync support is added later.

When wiring syncing into an existing offline-first application, engineers tend to change as little as possible. (Indeed, Firefox Sync began as an optional add-on to Firefox, with no changes to Firefox’s internals at all.)

The application continues to store only the current local state — a snapshot, an instant along the timeline. Parts of this persistent state are now annotated with metadata — change counters, timestamps, global identifiers, tombstones — to allow changes to be tracked, extending the state to address the concerns of the sync code. Each time the state changes the old state is lost; only the metadata serves to indicate that a previous state existed.

The last synced state is kept on a server (often a simple object or document store) in much the same way. A simple latest-wins form of conflict resolution suffices to manage change, hoping that few conflicts will occur.

There are various permutations of this depending on how merges are performed and which actors in the system do the work, but the idea is the same. In this post we’ll call this snapshot-oriented sync.

Snapshot-oriented sync is simple and mostly non-intrusive, and for some applications it’s sufficient.

Merging

A client that fetches changes from the server and merges them with local state, without reference to earlier states, by definition performs a two-way merge (which isn’t really a merge at all!). If only one client writes at a time, this is all we need; the other timeline is known to be the previous state, and the change applies without conflict.

A more sophisticated client that tracks the earlier shared parent (prior to remote changes, local changes, or both) can perform a three-way merge, which is less likely to lose data when two clients make changes around the same time.

Note that in a snapshot-oriented approach — one that looks only at the current state of each client — the intermediate states aren’t available for use in the merge: some aren’t published to the server (and the rest aren’t preserved), so clients never see each other’s states, and the application doesn’t record its own fine-grained history.

At this point we don’t need to know how a merge works; it’s enough to know that a merge produces a unified timeline, and we can have more or less information available when we do so. A snapshot-based system might merge changed fields, or take one side over the other based on a timestamp, or even pause and ask the user to resolve a conflict.

The longer the duration between syncs, the more frequent the changes, the more coarse-grained the entities being synchronized, and the larger the volume of data, the more likely it is that the limitations of this approach will become apparent. It’s also difficult to avoid data loss without doing the necessary bookkeeping to support three-way merge, but that bookkeeping is onerous.

Tracking of change over time in a snapshot-oriented system can be done in a number of ways: e.g., if an application’s data model is relatively straightforward, clients might include in each object an explicit record of the client’s view of recent changes, using that record to do a better job of merging, or clients might keep a copy of the last snapshot seen during a merge.

In snapshot-oriented sync there is no first-class, explicit (certainly not explicit and long-lived) model of the passage of time, of change, or even of the state of other devices. As a result it’s hard to support features like restore, and it can be difficult to reason about the sync-related properties of the system.

Converging on change

Rather than copying state around, implicitly deriving and reconciling changes, we can instead copy changes around, and work with those. The goal is to reach the same state — to converge — on each client.

We can try to do this in several ways. We can send carefully designed events, as in Operational Transformation, relying on each client to transform and apply each operation in order to converge on the same state. We can restrict ourselves to a particular set of datatypes (CRDTs) that allow us to either send commutative updates or guaranteed-mergeable states between clients, such that by definition each client will reach the same state. Or we can track enough state that we can send deltas or diffs to each client, as in Differential Synchronization.

All change-oriented methods have advantages over an ad hoc state-based approach, but they make tradeoffs, have significant difficulties (particularly around ordered delivery of operations), and accrue some complexity: DiffSync must maintain shadow copies, OT must take great care with transformation to ensure that the system converges, and CRDTs limit storage design. OT and DiffSync have been widely used for collaborative editing tools, but these techniques are harder to apply to long-lived structured data across offline devices. There is very little discussion in the literature about the application of these specific techniques to client-encrypted data, where the server can’t help with diffing. (If you know of some, please let me know!)

We can call the application of these techniques to syncing transformation-oriented sync, because clients exchange the changes that should be applied to the other clients to achieve a desired outcome. (CvRDTs are technically an exception.) In transformation-oriented sync there is no long-lived model of change over time. The syncing protocol is nonetheless quite invasive and closely coupled to storage, because it relies on an accurate stream of changes as they occur, or tightly constrains the data structures used.

Events

If your application works solely in terms of commutative changes, with minimal or no deletion, and little need for managing identifiers, then simple replication of events — a trivial kind of CRDT — might be sufficient. For example, consider a music player that records an event each time a track is played, by name only. We can model this with a simple event object:

{"title": "Bohemian Rhapsody", "played": "2017–10–02T15:48:44Z"}

Clients can share these objects, and as soon as each client has the same set, they all have the same state.

Unfortunately, most applications are not this simple. How do we identify events in order to delete them? How do we represent richer track entries that might be renamed, linked to artists or albums, or be associated with cover art? How do we order events without comparing client clocks?

This approach is instructive, however, and we’ll return to it in Part 3.

Looking back to look forwards

In Firefox we see writes on different clients that conflict and need to be resolved with some finesse: one device bumps the lastUsed field of a login, another saves your new password in the password field, they race to sync, and — because desktop Firefox only does two-way merges — your password change is undone.

Firefox clients spend prolonged periods offline. Your work machine is offline all weekend, your cellphone is in Airplane Mode during the flight while you use your tablet on the overpriced Wi-Fi, and so on. Even when they are online, they can’t assume cheap, high-bandwidth connectivity.

Moreover, the set of Firefox Account clients isn’t definitively known in advance; a new device can sign in at any moment.

An application’s requirements might be even broader. Perhaps there’s a need for an audit trail or backup/restore functionality, or maybe it’s important that each client converge on exactly the same state. Convergence is a property that’s often expected but can’t be assumed, and indeed Firefox Sync clients do not necessarily converge, sometimes in surprising ways.

We know that snapshot-based syncing is limited: it forgets the sequence of events, making conflicts more difficult to resolve, and it has lots of other limitations. It seems natural to look for a mechanism that syncs changes, rather than snapshots. But transformation-oriented approaches, as used in Google Docs, are not a natural fit for long-lived, offline, encrypted systems.

With these requirements in mind there’s little alternative but to design around a history of change on each client and on the server, with one history being canonical. This ongoing record tracks enough of each timeline to allow for a three-way merge, with conflicts detected, and for offline and new clients to catch up incrementally without data loss. Clients themselves have the cleartext to merge.

Depending on other factors (e.g., whether clients need rapid random access to server storage), this design tends towards an implementation in which clients store and exchange deltas or revisions, collaboratively building a shared concrete timeline. DVCS tools are just like this: clients store sequences of commits, using merges and rebases to combine branches. In Git or Mercurial terms, the current agreed-upon state of the world is a particular branch, usually called master, and when a client pushes changes that branch is atomically advanced to a new ‘head’, a particular state that evolved from an earlier master.

We will call this log-oriented sync, because clients record a log of states — their own timeline — that diverges from a known shared history, and then refer to the logs of two timelines when merging. This is subtly different from transformation-oriented sync, because the log represents a sequence of recorded or merged states rather than an outgoing queue, and because the clients agree on history: there is a strict shared ordering of states. Local changes might need to be rebased or augmented when two timelines merge, but they definitely already happened.

CouchDB works like this: a revision tree stores history, automatically resolved conflicts refer to earlier revisions for manual repair, and synchronizing two CouchDB or PouchDB databases involves bidirectional replication of revisions.

Jay Kreps wrote an excellent and comprehensive treatise on the centrality of the log to modern data systems; it’s well worth the read.

Aside: the role of the server

Firefox Sync uses client-side crypto to achieve privacy guarantees, which by definition means that a central server can’t use the content of stored data to manage change control.

However, that only restricts the system’s ability to use a central server to resolve conflicts. For our purposes, let’s define a conflict as the presence on two timelines of two incompatible changes.

In a system with client-side crypto, conflicts typically need to be resolved by a client that can see cleartext.

It’s still possible to use a central coordinator to serialize writes and detect collisions, and of course to deliver changes to clients. It’s even possible to find data via salted and hashed metadata, allowing random or attribute-based access: Firefox Sync does this by giving each record a GUID and an optional sortindex field.

Conclusion

Bearing in mind that a sync is the process of reaching agreement between clients, we covered three basic approaches:

Exchange snapshots of state, inferring changes when needed to resolve conflicts.
Exchange streams of changes or diffs, applying them to each client’s state so that each client should converge on the same end state.
Maintain a relatively long-lived history of changes, periodically merging each client’s timeline back with the main timeline.

This only partly answers the question of how to model application state; clearly there is interplay between how application data is represented and how it is synchronized, but that’s a long way from saying something concrete like “store state as JSON objects with vector clocks”. This is a big topic, so we’ll get to it in Part 3.

Thinking about Syncing, Part 1: timelines

Wed, 11 Oct 2017 00:00:00 +0000

I’ve been thinking about syncing data — in particular, about Firefox Sync, systems that touch it, and systems that might replace it — for about seven years now. I’ve been thinking about data representation in general for most of my career.

In that time I’ve begun to piece together a practical understanding of what we mean when we talk about syncing. This series of blog posts aims to capture some context and my mental model, and draw some conclusions that I think are true, given the constraints.

Comments, clarifications, and pointers to literature that I’ve missed would be very much appreciated.

In this blog post we will introduce the concept of a timeline, which is a linear sequence of states, and define syncing for our purposes as the process of merging two timelines.

Subsequent posts will cover some approaches to merging timelines, examine how the concerns of synchronization are in tension with other parts of an application, and suggest a model for resolving that tension by separating concerns.

Let’s start at the beginning: by exploring what sync systems do.

“To sync”, at its most abstract, is to take two or more storage systems and make them agree on the state of their shared world.

There can be several reasons for doing this — to deliver a consistent user experience across devices, to safeguard the user’s data, to enable new product features — but the method is similar regardless.

Typically those systems will persist data, but in some cases it might be temporary. For example, Firefox Sync’s tab sync doesn’t write to disk. The current tabs are all fetched from the server when you launch Firefox, kept up to date during the session, then dropped when you quit.

Typically their interactions will be mediated by a central server, but they could also communicate directly.

Typically the state of the world is application data, usually data that reflects a user’s activity.

Typically clients (in my world they’re rich client applications like browsers) will sync more than once, keeping themselves “in sync” over time, but we can also imagine situations in which a client syncs only once.

Firefox Sync is typical in these senses. Your Firefox profiles periodically synchronize long-lived user data with each other, communicating via encrypted payloads stored on a central server that acts as a shared whiteboard.

The process of syncing is, at a slightly less abstract level, the process of exchanging enough facts that the clients can then converge on agreement.

Exchanging facts and reaching agreement can be as simple as a one-way overwrite of a file, or as complex as a fine-grained automatic merge over local Wi-Fi.

We can think about the commonalities of these processes in terms of timelines and merges.

Timelines and merges

Consider a particular state — your current to-do list, perhaps. That state is the successor of an earlier state, and the precursor of a later state. In the to-do list example the earliest state is an empty list. As you add items, delete them, and mark them as done, you change one state into another. We will call this ordered sequence of states a timeline. If you’re familiar with version control systems, you might think of this as a branch, though unlike VCSes most CRUD apps don’t keep the old states around.

The simplest networked systems don’t sync at all; there is a single timeline on a server, and only rendered state is held by the client. Traditional CGI web apps are this way: all persistent data is stored on a server, and changes are made by sending a request to that server. Your to-do list might look like this:

The next most simple are those that keep a copy of the state itself on each client, but only change it in one place. All attempted modifications are sent to the server to be applied to the canonical state. The server responds with updates to the client’s state; the server’s updates always win over any speculative changes made by the client. This is how Redux apps often work. This is syncing without any trouble: it’s a single loop of state transfer. There’s only one timeline, even if multiple clients are asking the server to make changes. All actions move inwards to the server, changes move outwards to the clients, and control over races is in the hands of the server.

When we think of “syncing”, though, we usually think of systems in which multiple timelines can occur as part of normal operation. Those systems are more complex than an ordinary web app: they require the clients or the server to sometimes combine two timelines into one, or to maintain two timelines in parallel.

We can start to form a taxonomy of the more complex kinds of system by asking some questions. These questions are phrased in terms of a central server, but they generalize to peer-to-peer or local-only setups.

Can you record any permanent data before beginning to sync?

If so, multiple starting timelines need to merge at least once. This situation is unavoidable for a product with as long a history as Firefox — we have millions of users with precious bookmarks and logins on their devices, and they want to keep them when they start using a Firefox Account. If those users have more than one device, they’ll start with at least two separate timelines.

(Note that these little diagrams represent timelines — sequences of states — not the state of each device. After the merge there are still two clients, but they both agree on the same state of the world: they are at the same point on the same timeline.)

Can you record data while temporarily partitioned from the syncing infrastructure?

Clients might not be constantly communicating with the server or other clients. They might face network interruptions or outages, be running on a device that forbids background syncing or cellular data use, or the app might only sync on a schedule.

In these scenarios, clients must buffer outgoing changes, and the client or the server must perform little ‘mini merges’ on each sync. State changes on two devices temporarily result in two timelines, both of which must merge back.

Can you stop using sync and keep your data?

If you stop syncing altogether — not just a temporary blip, but really signing out — can you record more data, then later resume use of sync? Or can you sign out of one account and sign in to another account with the same data?

This question implies some complex weaving of timelines; indeed, the production of new long-lived timelines. Each client can sign out, record some new data, restore from backups, and do all kinds of things before signing into the same or another account. Users really do these things (and worse!) and expect something sensible to happen. Some Firefox Sync users routinely sign in and out in order to tightly control when syncing happens, or to try to build unidirectional syncing workflows.

A user might even decide to take a device from one account to another.

Can a user rewrite history?

Can you restore from a local backup while sync is enabled? Can you permanently erase data?

Permanently erasing data — e.g., clearing your recent browsing history, or deleting saved logins — requires scrubbing the ‘official record’. Given that the official record is spread across multiple clients, and perhaps one or more servers, this can be complex. Even ordinary deletion isn’t necessarily trivial.

(Firefox Sync gets this wrong in some edge cases. The most obvious is history: visits, described by their timestamps and reasons, are stored as part of their parent history record, which holds the URL and current title, and that structure has no way to represent the deletion of an individual visit. Consequently, using the ‘Forget’ feature in Firefox to erase just part of your long-term interactions with a website has no effect on those visits that have already synced to another device.)

Conclusion

We began by defining syncing as the process by which clients converge on agreement about the state of their shared world.

We defined a timeline as an ordered sequence of states. We recognized that the simplest form of syncing is a one-way transfer of state corresponding to steps along a timeline. We came to a stipulative real-world definition of syncing as the transfer of such state that might require the merging of two timelines. We asked a number of questions, the answers to which determine when timelines might be created or merged.

Many apps say “no” to some or all of these questions. Firefox needs to support all of them; Firefox Sync clients have complex state, and periodically merge it with changes from other clients.

Saying “no” to some of these questions constrains the definition of syncing. Indeed, if you say “no” to all of the first three — mandatory account, no offline work, no disconnect — the app doesn’t meet our stipulative definition of syncing at all. Such an app is a website or a remoting system like X Windows or VNC, not a syncing system.

In future parts we will touch on the representation of states and timelines in an application, discuss some broad approaches to syncing, examine some of the tasks that must be performed in a syncing system, and finally discuss how support for sync should affect API design and how a client stores data.

Part two is next!

A conceptual introduction to Project Mentat

Tue, 21 Feb 2017 00:00:00 +0000

This post is intended for a particular audience: developers who perhaps haven’t done lots of database or data representation work, want to get started working on Mentat, and are looking for a ‘quick fix’ of context.

You should have already read Introducing Project Mentat. This post might fill in some gaps for you.

This post covers, very briefly:

What is a database?
What is a schema?
What is event sourcing?
What is Datomic?
What is SQLite?
What is Project Mentat?

The answers are relatively brief and somewhat opinionated, but they should offer enough of a starting point for further research. Disagreement is welcome!

What is a database?

Databases of all kinds share a common goal: to provide persistent storage and querying to one or more applications. Beyond that, different tradeoffs yield a surprising variety of solutions.

The traditional “databases”, as most developers now understand the term, are relational SQL databases. The only interaction is via a textual query language, SQL. These databases are typically ACID: atomic, consistent, isolated, and durable.

Atomic means that a write (indeed, a related collection of writes) either happens in its entirety or doesn’t happen at all.

Consistent means that rules and conditions expressed in the database (e.g., foreign key constraints, NOT NULL constraints, type definitions) continue to apply at all times.

Isolated means that readers and writers don’t see each other while they’re working. Writes conceptually happen in one instant in time, and within a particular transaction you are isolated from those moments.

Durable means that once data is written it isn’t lost.

Note that these properties have nothing to do with whether a DB uses SQL, is relational, or otherwise. However, so-called “NoSQL” databases often neglect some of these properties: it’s not uncommon for them to lose data that’s been confirmed as written (MongoDB being the most mocked example), not expose transaction primitives (or for their transactional guarantees to only apply within a single row or document), or not bother with consistency constraints at all.

Some have argued that these properties (and other characteristics of relational databases) are obsolete: for example, that in-database consistency constraints are better handled by business logic inside an application. Different databases draw these lines in different places.

You might have heard terms like “eventually consistent” when used to describe distributed systems. (Eventual consistency simply means that, if you wait long enough, all readers will see the same last write.) Distributed systems are hard, and many hosted NoSQL databases are clustered in order to scale, forcing them to contend with the CAP theorem. We won’t dig any deeper into that, because for the purposes of this post we aren’t concerned with distributed storage.

Different databases have opinions about the kinds of data they store and the way they model it. Sometimes these opinions are so pervasive that we don’t really notice them.

Relational databases store relations between entities. SQL databases model relations as tables with an arbitrary number of columns. Entities are rows in some table, and are identified by keys. Queries join relations (tables) to yield new relations. SQL databases are not ideal for storing graphs — graph traversal requires recursive joins, which is a relatively recent SQL feature — documents, unstructured data, etc., though PostgreSQL is often good enough.
Document databases store content without establishing an up-front explicit schema. They often use JSON or XML as a native data format. (‘Schemaless’ storage seems like a time saving, but see below.)
Graph databases model data as links between nodes. Sometimes those links can themselves be annotated. Querying is via path traversals and conditionals, which can be very natural for some domains. Typically graph databases are designed to be faster at graph operations like “find me related actors” than querying a graph modeled in another kind of database.
Geospatial databases focus on spatial coordinates as a primary way of finding things.
Applications often use an ad hoc flat file as a database: it’s read into memory and flushed to disk when changed. Typically this is a knee-jerk response to a badly configured database (“I don’t need all that complex database stuff!”), or to a proliferation of independent databases, and it forces the application developer to manually choose when to flush, how to query data in memory, how to handle scaling, etc. Flat files are a good solution for data that rarely changes, isn’t concurrently modified, and is simple to query; configuration files are a good example. It’s a bad solution for data that changes frequently and needs to be read or written transactionally. Firefox has well-documented problems with session store.
And so on.

What is a schema?

A description of your domain. A recipe for the shape of your data. Go read Martin Fowler’s take on schemaless storage.

Database schemas are also often where indices/indexes are described: relational databases conflate semantic, structural, and index descriptions of data into a single schema.

Relational databases (and others) use indexes to make queries fast. An index in a relational database is (usually) a copy of all or part of a table, stored in a different order, with metadata to facilitate finding the right values.

What is event sourcing?

The concept that your application state is the end result of a sequence of events that took place since an earlier state, and the idea that the fundamental modeling construct for dealing with this is to record the sequence of events directly, deriving other data structures from the event stream. This should feel very familiar to React/Redux developers.

Event sourcing is loosely related to CQRS, which is the idea that the readers and the writers in your system are best served by different data representations. Our approach is not opinionated — CQRS, not ES — but you’ll often see them discussed together.

Again, go read Martin Fowler, and see this list of further reading.

What is Datomic?

Datomic is a closed-source database, written in Clojure and running on the JVM, and built and maintained by Cognitect. Datomic has a rich schema language, stores relational data (albeit a little more loosely than a SQL database does), and is distinguished by its attitude to time and change. The history of all changes is accessible to application code. The schema is similarly accessible. Schema definitions can evolve over time, with older definitions available just like older data. Applications can query past (and hypothetical future!) states of the system.

Datomic’s record of the data it stores is — unlike traditional relational databases — very aware of time and state, in a similar way to how Clojure makes explicit the distinction between values and identity in state.

All databases in consumer applications need to handle changing and growing data over time; Datomic includes this as part of its data model. By contrast, most databases entirely forget that changes ever took place, with changes only stored in a log for long enough to provide durability or replication: the stored data in the database itself at the current time is purely a snapshot, and applications that need to reflect time and change in their data model must do so explicitly. “Deletion” in Datomic is actually one of two things: retraction (stating that an earlier fact is no longer true) and excision (cutting out part of history as if it never took place, typically for legal or privacy reasons).

There are lots of other things that are a bit special about Datomic, like the distinction between peers and the transactor.

Read a conversational introduction (part 1, part 2), and watch Rich Hickey explain:

Datomic is a service that runs alongside a broad array of existing storage systems (including AWS and Cassandra), using them to store index chunks.

What is SQLite?

SQLite is a very stable, quite fast, extraordinarily well-tested, embedded SQL database. Embedded (also called “serverless”) databases are no longer that common; most SQL databases — indeed, most ACID databases — are relatively large hosted servers like PostgreSQL, MySQL, etc.

We use SQLite extensively in Firefox, and Mozilla has a good relationship with its developers.

What is Project Mentat?

Mentat is an embedded datom store: essentially Datomic’s data model and schema interface expressed on top of a SQLite database.

Naturally, many of Datomic’s concepts — e.g., scaling reads by replicating index chunks to peers — don’t apply, and the concept of database-as-value is less relevant in an embedded system. But we preserve the ideas of a first-class transaction log, a domain-level schema, transaction listeners, and so on.

The principal advantages of Mentat in applications like Tofino are:

It’s natural to grow the schema and make new relations between entities. Schemas change all the time in living applications.
Schema modeling is done at the domain level (“a page can have multiple visits”) not at the storage level (“the visits table has a column with a non-unique, not null foreign key constraint that refers to the pages table”).
Different parts of the application can cooperatively share a single database.
The transaction log is available for querying (and for synchronization purposes).
The query language makes it easy to express joins, particularly graph-like self joins that are very complex in SQL. Here’s an introduction to the Datalog query language used by Datomic, DataScript, and Mentat.
The architecture of the database makes it natural to address performance via materializing views and indexes, either inside or outside the database itself. For example, an attribute can be marked for full-text searching just by adding “:fulltext true” to the schema. Applications see every transaction as it occurs, and can thus build their own caches.
Many of the mistakes made by developers adding ad hoc flexibility to a database — such as a “metadata” table containing strings, resulting in inefficient storage and slow querying — have been avoided: the schema itself offers enough flexibility that stringly-typed storage is unnecessary.

Mentat uses a combination of SQLite’s own ACID properties and sequential writes to achieve ACID guarantees (more or less).

It’s worth recognizing at this point that how Mentat stores data in SQLite is an implementation detail. We could split up our datoms table into pieces; we could store the transaction log in an ATTACHed database; we could even automatically derive traditional ‘wide’ database tables where appropriate. The abstraction boundary is quite opaque, and only the transactor and the query engine need to know about the details. Abstracting storage in this way is itself valuable: we can make significant changes in how Mentat is implemented without altering our API surface, and improvements under the surface are immediately available to all consumers.

What next?

This post covered some context, but doesn’t address exactly how Mentat is built. Some of the pages on the project wiki cover that, but another post might be forthcoming.

Introducing Project Mentat, a flexible embedded knowledge store

Tue, 15 Nov 2016 00:00:00 +0000

Evolving storage is hard. Can we make it easier?

Edit, January 2017: to avoid confusion and to better follow Mozilla’s early-stage project naming guidelines, we’ve renamed Datomish to Project Mentat. This post has been altered to match.

For several months now, a small team at Mozilla has been exploring new ways of building a browser. We called that effort Tofino, and it’s now morphed into the Browser Futures Group.

As part of that, Nick Alexander and I have been working on a persistent embedded knowledge store called Project Mentat. Mentat is designed to ship in client applications, storing relational data on disk with a flexible schema.

It’s a little different to most of the storage systems you’re used to, so let’s start at the beginning and explain why. If you’re only interested in the what, skip down to just above the example code.

As we began building Tofino’s data layer, we observed a few things:

We knew we’d need to store new types of data as our product goals shifted: page metadata, saved content, browsing activity, location. The set of actions the user can take, and the data they generate, is bound to grow over time. We didn’t (don’t!) know what these were in advance.
We wanted to support front-end innovation without being gated on some storage developer writing a complicated migration. We’ve seen database evolution become a locus of significant complexity and risk — “here be dragons” — in several other applications. Ultimately it becomes easier to store data elsewhere (a new database, simple prefs files, a key-value table, or JSON on disk) than to properly integrate it into the existing database schema.
As part of that front-end innovation, sometimes we’d have two different ‘forks’ both growing the data model in two directions at once. That’s a difficult problem to address with a tool like SQLite.
Front-end developers were interested in looser approaches to accessing stored data than specialized query endpoints: e.g., Lin Clark suggested that GraphQL might be a better fit. Only a month or two into building Tofino we already saw the number of API endpoints, parameters, and fields growing as we added features. Specialized API endpoints turn into ad hoc query languages.
Syncability was a constant specter hovering at the back of our minds: getting the data model right for future syncing (or partial hosting on a service) was important.

Many of these concerns happen to be shared across other projects at Mozilla: Activity Stream, for example, also needs to store a growing set of page summary attributes for visited pages, and join those attributes against your main browsing history.

Nick and I started out supporting Tofino with a simple store in SQLite. We knew it had to adapt to an unknown set of use cases, so we decided to follow the principles of CQRS.

CQRS — Command Query Responsibility Segregation — recognizes that it’s hard to pick a single data storage model that works for all of your readers and writers… particularly the ones you don’t know about yet.

As you begin building an application, it’s easy to dive head-first into storing data to directly support your first user experience. As the experience changes, and new experiences are added, your single data model is pulled in diverging directions.

A common second system syndrome for this is to reactively aim for maximum generality. You build a single normalized super-flexible data model (or key-value store, or document store)… and soon you find that it’s expensive to query, complex to maintain, has designed-in capabilities that will never be used, and you still have tensions between different consumers.

The CQRS approach, at its root, is to separate the ‘command’ from the ‘query’: store a data model that’s very close to what the writer knows (typically a stream of events), and then materialize as many query-side data stores as you need to support your readers. When you need to support a new kind of fast read, you only need to do two things: figure out how to materialize a view from history, and figure out how to incrementally update it as new events arrive. You shouldn’t need to touch the base storage schema at all. When a consumer is ripped out of the product, you just throw away their materialized views.

Viewed through that lens, everything you do in a browser is an event with a context and a timestamp: “the user bookmarked page X at time T in session S”, “the user visited URL X at time T in session S for reason R, coming from visit V1”. Store everything you know, materialize everything you need.

We built that with SQLite.

This was a clear and flexible concept, and it allowed us to adapt, but the implementation in JS involved lots of boilerplate and was somewhat cumbersome to maintain manually: the programmer does the work of defining how events are stored, how they map to more efficient views for querying, and how tables are migrated when the schema changes. You can see this starting to get painful even early in Tofino’s evolution, even without data migrations.

Quite soon it became clear that a conventional embedded SQL database wasn’t a direct fit for a problem in which the schema grows organically — particularly not one in which multiple experimental interfaces might be sharing a database. Furthermore, being elbow-deep in SQL wasn’t second-nature for Tofino’s webby team, so the work of evolving storage fell to just a few of us. (Does any project ever have enough people to work on storage?) We began to look for alternatives.

We explored a range of existing solutions: key-value stores, graph databases, and document stores, as well as the usual relational databases. Each seemed to be missing some key feature.

Most good storage systems simply aren’t suitable for embedding in a client application. There are lots of great storage systems that run on the JVM and scale across clusters, but we need to run on your Windows tablet! At the other end of the spectrum, most webby storage libraries aren’t intended to scale to the amount of data we need to store. Most graph and key-value stores are missing one or more of full-text indexing (crucial for the content we handle), expressive querying, defined schemas, or the kinds of indexing we need (e.g., fast range queries over visit timestamps). ‘Easy’ storage systems of all stripes often neglect concurrency, or transactionality, or multiple consumers. And most don’t give much thought to how materialized views and caches would be built on top to address the tension between flexibility and speed.

We found a couple of solutions that seemed to have the right shape (which I’ll discuss below), but weren’t quite something we could ship. Datomic is a production-grade JVM-based clustered relational knowledge store. It’s great, as you’d expect from Cognitect, but it’s not open-source and we couldn’t feasibly embed it in a Mozilla product. DataScript is a ClojureScript implementation of Datomic’s ideas, but it’s intended for in-memory use, and we need persistent storage for our datoms.

Nick and I try to be responsible engineers, so we explored the cheap solution first: adding persistence to DataScript. We thought we might be able to leverage all of the work that went into DataScript, and just flush data to disk. It soon became apparent that we couldn’t resolve the impedance mismatch between a synchronous in-memory store and asynchronous persistence, and we had concerns about memory usage with large datasets. Project Mentat was born.

Mentat is built on top of SQLite, so it gets all of SQLite’s reliability and features: full-text search, transactionality, durable storage, and a small memory footprint.

On top of that we’ve layered ideas from DataScript and Datomic: a transaction log with first-class transactions so we can see and annotate a history of events without boilerplate; a first-class mutable schema, so we can easily grow the knowledge store in new directions and introspect it at runtime; Datalog for storage-agnostic querying; and an expressive strongly typed schema language.

Datalog queries are translated into SQL for execution, taking full advantage of both the application’s rich schema and SQLite’s fast indices and mature SQL query planner.

You can see more comparisons between Project Mentat and those storage systems in the README.

A proper tutorial will take more space than this blog post allows, but you can see a brief example in JS. It looks a little like this:

// Open a database.  
let db = await datomish.open("/tmp/testing.db");

// Make sure we have our current schema.  
await db.ensureSchema(schema);

// Add some data. Note that we use a temporary ID (the real ID  
// will be assigned by Mentat).  
let txResult = await db.transact(\[  
  {"db/id": datomish.tempid(),  
   "page/url": "[https://mozilla.org/](https://mozilla.org/)",  
   "page/title": "Mozilla"}  
\]);

// Let's extend our schema. In the real world this would  
// typically happen across releases.  
schema.attributes.push({"name":        "page/visitedAt",  
                        "type":        "instant",  
                        "cardinality": "many",  
                        "doc":         "A visit to the page."});  
await db.ensureSchema(schema);

// Now we can make assertions with the new vocabulary  
// about existing entities.  
// Note that we simply let Mentat find which page  
// we're talking about by URL -- the URL is a unique property  
// -- so we just use a tempid again.  
await db.transact(\[  
  {"db/id": datomish.tempid(),  
   "page/url": "[https://mozilla.org/](https://mozilla.org/)",  
   "page/visitedAt": (new Date())}  
\]);

// When did we most recently visit this page?  
let date = (await db.q(  
  \`\[:find (max ?date) .  
    :in $ ?url  
    :where  
    \[?page :page/url ?url\]  
    \[?page :page/visitedAt ?date\]\]\`,  
  {"inputs": {"url": "[https://mozilla.org/](https://mozilla.org/)"}}));

console.log("Most recent visit: " + date);

Project Mentat is implemented in ClojureScript, and currently runs on three platforms: Node, Firefox (using Sqlite.jsm), and the JVM. We use DataScript’s excellent parser (thanks to Nikita Prokopov, principal author of DataScript!).

Addition, January 2017: we are in the process of rewriting Mentat in Rust. More blog posts to follow!

Nick has just finished porting Tofino’s User Agent Service to use Mentat for storage, which is an important milestone for us, and a bigger example of Mentat in use if you’re looking for one.

What’s next?

We’re hoping to learn some lessons. We think we’ve built a system that makes good tradeoffs: Mentat delivers schema flexibility with minimal boilerplate, and achieves similar query speeds to an application-specific normalized schema. Even the storage space overhead is acceptable.

I’m sure Tofino will push our performance boundaries, and we have a few ideas about how to exploit Mentat’s schema flexibility to help the rest of the Tofino team continue to move quickly. It’s exciting to have a solution that we feel strikes a good balance between storage rigor and real-world flexibility, and I can’t wait to see where else it’ll be a good fit.

If you’d like to come along on this journey with us, feel free to take a look at the GitHub repo, come find us on Slack in #mentat, or drop me an email with any questions. Mentat isn’t yet complete, but the API is quite stable. If you’re adventurous, consider using it for your next Electron app or Firefox add-on (there’s an example in the GitHub repository)… and please do send us feedback and file issues!

Acknowledgements

Many thanks to Lina Cambridge, Grisha Kruglov, Joe Walker, Erik Rose, and Nicholas Alexander for reviewing drafts of this post.

Different kinds of storage

Tue, 26 Apr 2016 16:59:29 +0000

I’ve been spending most of my time so far on Project Tofino thinking about how a user agent stores data.

A user agent is software that mediates your interaction with the world. A web browser is one particular kind of user agent: one that fetches parts of the web and shows them to you.

(As a sidenote: browsers are incredibly complicated, not just for the obvious reasons of document rendering and navigation, but also because parts of the web need to run code on your machine and parts of it are actively trying to attack and track you. One of a browser’s responsibilities is to keep you safe from the web.)

Chewing on Redux, separation of concerns, and Electron’s process model led to us drawing a distinction between a kind of ‘profile service’ and the front-end browser itself, with ‘profile’ defined as the data stored and used by a traditional browser window. You can see the guts of this distinction in some of our development docs.

The profile service stores full persistent history and data like it. The front-end, by contrast, has a pure Redux data model that’s much closer to what it needs to show UI — e.g., rather than all of the user’s starred pages, just a list of the user’s five most recent.

The front-end is responsible for fetching pages and showing the UI around them. The back-end service is responsible for storing data and answering questions about it from the front-end.

To build that persistent storage we opted for a mostly event-based model: simple, declarative statements about the user’s activity, stored in SQLite. SQLite gives us durability and known performance characteristics in an embedded database.

On top of this we can layer various views (materialized or not). The profile service takes commands as input and pushes out diffs, and the storage itself handles writes by logging events and answering queries through views. This is the CQRS concept applied to an embedded store: we use different representations for readers and writers, so we can think more clearly about the transformations between them.

Where next?

One of the reasons we have a separate service is to acknowledge that it might stick around when there are no browser windows open, and that it might be doing work other than serving the immediate needs of a browser window. Perhaps the service is pre-fetching pages, or synchronizing your data in the background, or trying to figure out what you want to read next. Perhaps you can interact with the service from something other than a browser window!

Some of those things need different kinds of storage. Ad hoc integrations might be best served by a document store; recommendations might warrant some kind of graph database.

When we look through that lens we no longer have just a profile service wrapping profile storage. We have a more general user agent service, and one of the data sources it manages is your profile data.

Trivial SQL ORMs considered harmful

Mon, 07 Mar 2016 17:18:24 +0000

Our team has a little “things I learned this week” tradition in our team meetings, and it just blossomed onto our mailing list (async is better!).

In one such post, Michael pointed to sqldelight, a library to automatically generate Android SQL-handling code for a typed schema and a set of queries.

I wrote a little screed advising caution, which Margaret suggested would make a good blog post… so here it is, unedited.

Note that I have nothing against automated schema and query checking, nor against saving error-prone typing; my primary objection here is to the object mapping.

Michael notes:

It’s a square library that allows you to define your tables & queries in a separate text file and it will auto-generate table creation and methods of querying. To do so, it creates Objects which represent the row of your DB.

and I reply:

At the risk of being a negative nelly: broadly speaking I find this kind of trivial ORM to be a terrible design anti-pattern, and I strongly discourage its use for anything but saving some typing before committing a v0. We implemented something like this on the iOS side of the house, and it was a huge pain in the ass to get rid of later.

If your system is simple enough that you’re putting whole objects in and getting whole objects out — that is, a simple ORM is a good fit — you should instead be not using SQLite. Serialize your objects to a flat file in JSON and keep them in memory. Up to about 100KB of data, it’s better in almost every way. (There are some exceptions, but they’re exceptions.)

For everyone else, your inputs and outputs will differ, or you’ll need more control, and so you should run screaming from sqldelight.

There are at least five reasons why I feel this way. I’ll stop at five to avoid writing an epic.

Database tables really come into their own when you join them: bookmarks against favicons, hockey players against teams and games. If you join them (particularly with left/outer/etc. joins), your ORM needs to bulk up the generated model objects with optional fields; it has to, otherwise it can’t represent the result of the join.

Those optional fields leak throughout your app — hey, is that favicon ID supposed to be set here? Does it need to be set to -1 sometimes? — and make your life unpleasant.
SELECT \* is an anti-pattern in database work. You might not need all of the fields, but requesting them all limits the indices that the storage layer can use. A smart storage engine can use compound indices to make some queries with limited projections very fast indeed. Or perhaps you want to get unique values.

To take sqldelight’s example, you should not SELECT \* FROM hockey\_player; if you need that, slurp a JSON file instead! When populating a list view, you probably want SELECT name, id FROM hockey\_player ORDER BY position. For a name picker you want SELECT DISTINCT name FROM hockey\_player UNION hockey\_officials. And so on.
Migrations are a reality when dealing with data storage. sqldelight doesn’t seem to address this at all.
Syncability (and backup, and export, and…) are also a reality. A sync system typically has a very different viewpoint on data storage than the frontend — not only does that mean you have a set of fields that only part of the application cares about (which screws up your ORM), it also often means that two parts of the system have utterly different conceptions of seemingly straightforward actions like “delete this thing”. ORMs are (almost by definition) one size fits none.
Getting SQL-based storage — hell, getting any kind of storage — right is hard. Concurrency, performance, memory usage, and correctness all involve careful attention. Take a read of the Sqlite.jsm docs or some of Firefox for iOS’s database prep code if you want a hint of this. Libraries that generate data access code can slip past this attention, and that’s a bad thing.

Syncing and storage on three platforms

Thu, 24 Dec 2015 18:31:09 +0000

As it’s Christmas, I thought I’d take a moment to write down my reflections on Firefox Sync’s iterations over the years. This post focuses on how they actually sync — not the UI, not the login and crypto parts, but how they decide that something has changed and what they do about it.

I’ve been working on Sync for more than five years now, on each of its three main client codebases: first desktop (JavaScript), then Android (built from scratch in Java), and now on iOS (in Swift). Desktop’s overall syncing strategy is unchanged from its early life as Weave. Partly as a result of Conway’s Law writ large — Sync shipped as an add-on, built by the Services team rather than the Firefox team, with essentially no changes to Firefox itself — and partly for good reasons, Sync was separate from Firefox’s storage components. It uses Firefox’s observer notifications to observe changes, making a note of changed records in what it calls a Tracker. This is convenient, but it has obvious downsides:

From an organizational perspective, it’s easy for developers to disregard changes that affect Sync, because the code that tracks changes is isolated. For example, desktop Sync still doesn’t behave correctly in the presence of fancy Firefox features like Clear Recent History, Clear Private Data, restoring bookmark backups, etc.
Sync doesn’t get observer notifications for all events. Most notably, bulk changes sometimes roll-up or omit events, and it’s always possible for code to poke at databases directly, leaving Sync out of the loop. If a Places database is corrupt, or a user replaces it manually, Sync’s tracking will be wrong. This is almost inevitable when sync metadata doesn’t live with the data it tracks.
Sync doesn’t track actual changes; it tracks changed IDs. When a sync occurs, it goes to storage to get a current representation of the changed record. (If the record is missing, we assume it was deleted.) This makes it very difficult to do good conflict resolution.
In order to avoid cycles, Sync stops listening for events while it’s syncing. That means it misses any changes the user makes during a sync.
Similarly, it doesn’t see changes that happen before it registers its observers, e.g., during the first few seconds of using the browser.

Beyond the difficulties introduced by a reliance on observers, desktop Sync took some shortcuts¹: it applies incoming records directly and non-transactionally to storage, so an interrupted sync leaves local storage in a partial state. That’s usually OK for unstructured data like history — it’ll try again on the next sync, and eventually catch up — but it’s a bad thing for something structured like bookmarks, and can still be surprising elsewhere (e.g., passwords that aren’t consistent across your various intranet pages, form fields that are mismatched so you get your current street address and your previous city and postal code).

During the last days of the Services team, Philipp, Greg, myself, and others were rethinking how we performed syncs. We settled on a repository-centric approach: records were piped between repositories (remote or local), abstracting away the details of how a repository figured out what had changed, and giving us the leeway to move to a better internal structure.

That design never shipped on desktop, but it was the basis for our Sync implementation on Android.

Android presented some unique constraints. Again, Conway’s Law applied, albeit to a lesser extent, but also the structure of the running code had to abide by Android’s ContentProvider/SyncAdapter/Activity patterns.

Furthermore, Fennec was originally planning to support Android’s own internal bookmark and history storage, so its internal databases mirrored that schema. You can still see the fossilized remnants of that decision in the codebase today. When that plan was nixed, the schema was already starting to harden. The compromise we settled on was to use modification timestamps and deletion flags in Fennec’s content providers, and use those to extract changes for Sync in a repository model.

Using timestamps as the basis for tracking changes is a common error when developers hack together a synchronization system. They’re convenient, but client clocks are wrong surprisingly often, jump around, and lack granularity. Clocks from different devices shouldn’t be compared, but we do it anyway when reconciling conflicts. Still, it’s what we had to work with at the time.

The end result is over-engineered, fundamentally flawed, still directly applies records to storage, but works well enough. We have seen dramatically fewer bugs in Android Sync than we saw in desktop Sync between 2010 and 2012. I attribute some of that simply to the code having been written for production rather than being a Labs project (the desktop bookmark sync code was particularly flawed, and Philipp and I spent a lot of time making it better), some of it to lessons learned, and some of it to better languages and tooling — Java and Eclipse produce code with fewer silly bugs² than JavaScript and Vim.

On iOS we had the opportunity to learn from the weaknesses in the previous two implementations.

The same team built the frontend, storage, and Sync, so we put logic and state in the right places. We track Sync-related metadata directly in storage. We can tightly integrate with bulk-deletion operations like Clear Private Data, and change tracking doesn’t rely on timestamps: it’s an integral part of making the change itself.

We also record enough data to do proper three-way merges, which avoids a swath of quiet data loss bugs that have plagued Sync over the years (e.g., recent password changes being undone).

We incrementally apply chunks of records, downloaded in batches, so we rarely need to re-download anything in the case of mid-sync failures.

And we buffer downloaded records where appropriate, so the scary part of syncing — actually changing the database — can be done locally with offline data, even within a single transaction.

Storage on iOS is significantly more involved as a result: we have sync_status columns on each table, and typically have two tables per datatype to track the original shared parent of a row. Bookmark sync is shaping up to involve six tables. But the behavior of the system is dramatically more predictable; this is a case of modeling essential complexity, not over-complicating. So far the bug rate is low, and our visibility into the interactions between parts of the code is good — for example, it’s just not possible for Steph to implement bulk deletions of logins without having to go through the BrowserLogins protocol, which does all the right flipping of change flags.

In the future we’re hoping to see some of the work around batching, use of in-storage tracking flags, and three-way merge make it back to Android and eventually to desktop. Mobile first!

My feeling is that Weave was (at least from a practical standpoint) originally designed to sync two desktops with good network connections, using cheap servers that could die at any moment. That attitude doesn’t fit well with modern instant syncing between your phone, tablet, and laptop! ↩︎
For example, Sync’s tab record format, defined by the desktop code, includes a time last used. Sometimes this is a string, and sometimes it’s an integer. Hooray JavaScript![/ ↩︎

About on rnewman

Camellia oil: the ideal shave

Tenets

Writing for an audience

First month at Stripe

Tip: Synergy lag/jerking on macOS

I'm back; things changed

A long pause

Thinking about Syncing, Part 4: keeping track

The needs of synchronization

Management of identifiers

Equivalence

Management of change

Detection of conflict

Consistency (in the database sense)

Eventual consistency (in the distributed system sense)

Quiescence

Atomicity

Incrementality

Continuation of service

A modest proposal

Thinking about Syncing, Part 3: separation of concerns

Events and tables

An abridged history of version control

Seeing similarities

Generalizing the argument: CQRS

Conclusion

Thinking about Syncing, Part 2: timelines and change

The state of things

Merging

Converging on change

Events

Looking back to look forwards

Aside: the role of the server

Conclusion

Thinking about Syncing, Part 1: timelines

Facets of a sync system

Timelines and merges

Can you record any permanent data before beginning to sync?

Can you record data while temporarily partitioned from the syncing infrastructure?

Can you stop using sync and keep your data?

Can a user rewrite history?

Conclusion

A conceptual introduction to Project Mentat

What is a database?

What is a schema?

What is event sourcing?

What is Datomic?

What is SQLite?

What is Project Mentat?

What next?

Introducing Project Mentat, a flexible embedded knowledge store

Acknowledgements

Different kinds of storage

Trivial SQL ORMs considered harmful

Syncing and storage on three platforms