Yes, you should use SQLite: Notion's creative SQLite usage for 20% faster loads
(6 minute read) Thanks to WebAssembly, SQLite is even more magical on the web
Stacksweep is a software engineering refresher newsletter, with applicable takeaways for the job and refreshers to stay current quickly.
Most developers, when thinking about what database to use, think about MySQL/Postgres (relational) or Mongo/Dynamo (NoSQL). For many production grade applications, it’s probably a good idea to use one of these for the foundation of your database.
But it’s not always necessary. Kent Dodds argues that you could probably just use SQLite in production. Regardless of whether you agree or not, the points he makes in favor of SQLite are valuable:
zero latency
simplified setup
easy multi-instance replication
can handle massive databases (more than you think
much easier to develop and test with.
But in reality, SQLite simply has tradeoffs. My favorite quote from the founder of SQlite is simply that: "Think of SQLite not as a replacement for Oracle but as a replacement for fopen().".
And when you use it right - you can get some amazing results! Notion used SQLite client-side to reduce their web app’s page navigation latency by 20%!
Before we begin, I want to note that this is a short summary, with commentary and useful links, of Notion’s original blog post: How we sped up Notion in the browser with WASM SQLite. My job here is to distill the key points and relate the general, applicable takeaways. You can skip to the “What you should know” section at the bottom to read this first and try the “Test Your Knowledge” Quiz.
SWE Quiz
SWE Quiz is the easiest way to build your system design fundamentals through 500+ questions on concepts like authentication, databases, and more.
SWE Quiz has crash courses to get up to speed quickly, smart quizzes to fill in gaps of knowledge, interview readiness scores, and concept roadmaps. It is offering a LIFETIME payment for a limited time and new content is being added weekly.
Background
Notion implemented SQLite caching in their desktop apps in 2021, resulting in:
50% faster initial page loads
50% faster navigation between pages
Inspired by this success, Notion's team decided to bring these performance gains to their web app using WebAssembly (WASM) SQLite.
How Notion did it
Each Notion tab in your browser has its own Web Worker dedicated to writing to SQLite. They used WASM SQLite, a version of SQLite that has been compiled to WebAssembly (WASM) for use in web browsers.
Refresher: Web Workers provide a straightforward mechanism for web content to execute scripts in background threads. These worker threads can carry out tasks without disrupting the user interface. Moreover, they have the capability to perform network requests using APIs like fetch()
or XMLHttpRequest
. After a worker is initiated, it can communicate with its creator JavaScript code by sending messages to a designated event handler, and this communication works both ways.
This allows for background processing without affecting the UI thread.
To prevent database corruption from concurrent writes, Notion uses a SharedWorker and the Web Locks API.
A SharedWorker is a type of Web Worker that can be accessed by multiple scripts or windows from the same origin, allowing shared processing and data across different parts of a web application. Unlike dedicated workers, which are tied to a single script, SharedWorkers can be utilized by various scripts simultaneously
Only the "active tab" can write to the SQLite database at any given time.
The SharedWorker keeps track of the active tab and redirects database write queries from non-active tabs to the active tab's Web Worker. This maintains data integrity.
Challenges
This wasn't necessarily EASY. Notion faced some challenges along the way:
Initial blocking of page loads by WASM SQLite
Slow disk reads on older devices
Database corruption due to concurrency issues
Solving the initial blocking of page loads
They optimized WASM SQLite loading by making it asynchronous, which improved initial page load times at the cost of not using SQLite for the very first page load.
Fixing slow disk reads
To address slow disk reads, Notion implemented a "race" between disk cache and network requests. This ensures the fastest option is always used, regardless of device performance.
Here’s another way of looking at the “racing” of network requests:
Database corruption due to concurrency issues
They solved this using the SharedWorker approach mentioned above:
The Results
The result of all these optimizations?
A 20% improvement in navigation times without negatively impacting other metrics.
Notion’s original blog post: How we sped up Notion in the browser with WASM SQLite
What you should know
From this, the important things to know are:
What WebAssembly is. This is an example of how WebAssembly is an improvement on the current state of the web.
Definition: A binary instruction format for efficient, low-level code execution in web browsers
What to know: WASM allows running high-performance, compiled languages (like C, C++, Rust) in the browser, enabling complex applications and games to run at near-native speed on the web
What Web Workers are.
Definition: Scripts that run in the background, separate from the main web page script
What to know: Use Web Workers for computationally intensive tasks to prevent UI freezing and improve overall application responsiveness
What SharedWorkers are.
Definition: A type of Web Worker that can be accessed by multiple scripts or windows from the same origin
What to know: Useful for shared processing and data across different parts of a web application. Allows for better resource usage and proper data syncing.
What "asynchronous loading" is.
Definition: Loading resources or executing code without blocking other operations or UI rendering
What to know: Implement asynchronous loading to improve perceived performance and user experience, especially for resource-intensive applications or slow network conditions
Test Your Knowledge
You're tasked with optimizing a web-based video editing application that's experiencing performance issues. The app allows users to apply complex filters and effects to high-resolution videos in real-time. Currently, the app freezes during render operations and struggles with large file uploads. Which combination of technologies and approaches would best address these issues?
A) Use Web Workers for video rendering, WebAssembly for implementing complex video filters, and asynchronous loading for file uploads.
B) Implement SharedWorkers for video rendering, use JavaScript for video filters, and synchronous loading for file uploads.
C) Utilize WebAssembly for video rendering, Web Workers for implementing video filters, and synchronous loading for file uploads.
D) Use SharedWorkers for file uploads, JavaScript for video rendering, and WebAssembly for implementing video filters.
Correct Answer: A
This solution addresses the main issues:
Web Workers for video rendering prevents UI freezing during complex render operations.
WebAssembly allows for high-performance implementation of complex video filters, potentially using existing C/C++ libraries compiled to WASM.
Asynchronous loading for file uploads improves the user experience by not blocking other operations during large file transfers.
Option B is incorrect because SharedWorkers aren't necessary for this single-tab application, JavaScript is less efficient for complex filters, and synchronous loading would block the UI during uploads.
Option C misuses WebAssembly and Web Workers, and synchronous loading would still cause UI blocking.
Option D misapplies SharedWorkers, uses less efficient JavaScript for rendering, and while WebAssembly for filters is good, the overall architecture is suboptimal.
If you liked this quiz, you should check out SWE Quiz.
> can handle massive databases (more than you think)
Do you have experience with properly making sqlite scale? Before, I did some testing with a 50 gig database with something like 300 million records (of 6-8 text columns) and it chugged along in selects