Infrastructure in '23
Wednesday, February 22nd 2023
For the third year running, I set aside some time at the beginning of the year to share what I believe to be the most dynamic and important areas of innovation in infrastructure. If you share my interest in any one or more of these areas, I would love to hear from you.
While I have written previously about the rise of serverless computing, I was slow to appreciate the role JavaScript would play in pushing it forward. JavaScript is the only language that lives up to “write once, run anywhere.” It has the most vibrant ecosystem of any language on the planet, offers unmatched startup times, and is secure enough to run untrusted code on behalf of users without modification or special tooling.
![](https://api.funkhaus.us/wp-content/uploads/2023/06/image-92.jpeg)
There is also a clear plurality of engineers who rely on it as their primary language. Thus, it’s hardly a coincidence that the emerging serverless compute players found success with JavaScript developers. Their products present credible alternatives to AWS for hosting web apps, notably by building on V8, the same sandboxing technology that powers Google Chrome. JavaScript engines like V8 do not require operating system virtual machines or containers to run, and offer unique performance and security advantages. They also run non-JS code via WebAssembly. JavaScript’s dominance is furthered by the way it unifies front-end and back-end development with TypeScript and frameworks like Next.js and Remix. At the same time, runtimes like Bun and Deno are improving upon its usability and performance.
![](https://api.funkhaus.us/wp-content/uploads/2023/06/image-93-1.jpeg)
While the footprint of web apps written in other languages remains vast, new platforms like Fly are challenging the major clouds with a similar, albeit container-based, approach for this audience. I believe the shift in developer preference toward these platforms will become more apparent this year, and will continue to grab the attention of startups seeking to reimagine core layers of the application stack for the era of serverless compute.
![](https://api.funkhaus.us/wp-content/uploads/2023/06/image-94.jpeg)
However, the space seems too important for innovation to stop here. The emergence of workflow systems raises the question of how state management will evolve in this part of the back end. Today, workflow systems connect to a database service that handles state. As workflow systems become widely adopted, there appears to be an opportunity to better customize databases and persistence layers to support their requirements. One potential outcome is that workflow systems bundle state management into their cloud offerings to differentiate. It seems equally possible that database vendors will explore ways to tailor their offerings to better position themselves for this increasingly strategic workload. I expect this to be an active design space this year, and am excited to see what comes of it.
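To make the current arrangement concrete, here is a minimal sketch of a workflow runner that checkpoints each step's result to a database, so that a restarted run resumes where it left off. It uses Python's built-in `sqlite3` as a stand-in for the database service. The `run_workflow` function, the `steps` table, and the run and step names are hypothetical, and are not taken from any particular workflow system.

```python
import json
import sqlite3


def run_workflow(conn, run_id, steps):
    """Run steps in order, checkpointing each result to the database.

    `steps` is a list of (name, function) pairs. Each function receives the
    previous step's result. Steps already recorded for `run_id` are skipped.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS steps ("
        "run_id TEXT, name TEXT, result TEXT, PRIMARY KEY (run_id, name))"
    )
    result = None
    for name, fn in steps:
        row = conn.execute(
            "SELECT result FROM steps WHERE run_id = ? AND name = ?",
            (run_id, name),
        ).fetchone()
        if row is not None:
            # Completed in an earlier attempt: reuse the stored result.
            result = json.loads(row[0])
            continue
        result = fn(result)
        # Persist the result before moving on to the next step.
        with conn:
            conn.execute(
                "INSERT INTO steps (run_id, name, result) VALUES (?, ?, ?)",
                (run_id, name, json.dumps(result)),
            )
    return result
```

If a later step raises, calling `run_workflow` again with the same `run_id` skips the steps that already finished. All of the durability here comes from the database, which is why the question of who owns that layer matters.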
![](https://api.funkhaus.us/wp-content/uploads/2023/06/image-95.jpeg)
In the unbundled OLAP architecture, data is stored directly in object storage like S3 or GCS. Open-source table formats like Hudi and Iceberg handle indexing: they structure the data and provide transactional guarantees over it, so that it can be queried by a distributed query engine like Trino, or in-process with DuckDB. This allows the right storage, indexing, and querying technologies to be applied to each use case on the basis of cost, performance, and operating requirements. I’ve found it easy to underestimate the power of “ease of use” in infrastructure, which is why I’m particularly excited by DuckDB’s in-process columnar analytics experience. At the same time, open-source projects like DataFusion, Polars, and Velox are making it possible to develop query engines for use cases that were previously considered “too niche” to build for. As the industry standardizes on Arrow for in-memory data representation, the challenge of how data is shared across these new platforms is solved. I expect this will lead to rapid innovation in analytical databases by commoditizing the approach to query execution that was a major driver of Snowflake’s success.
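The separation of layers described above can be illustrated with a toy model in plain Python. This is only a sketch of the idea, not how Hudi, Iceberg, Trino, or DuckDB are implemented: a dict stands in for object storage, a list of snapshots stands in for the table format, and a small scan function stands in for the query engine. All names here (`object_store`, `snapshots`, `commit`, `scan`, `total_by_region`) and the CSV file layout are assumptions made for illustration.

```python
import csv
import io

# Layer 1: object storage, modeled as a dict of immutable blobs keyed by path.
object_store = {}

# Layer 2: a toy "table format". Each snapshot is the tuple of data files
# that make up the table at that point in time. Snapshot 0 is the empty table.
snapshots = [()]


def commit(rows):
    """Write rows as a new immutable file, then publish a new snapshot."""
    path = f"data/part-{len(object_store):05d}.csv"
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    object_store[path] = buf.getvalue()
    # Older snapshots are never modified, so existing readers are unaffected.
    snapshots.append(snapshots[-1] + (path,))
    return len(snapshots) - 1


def scan(snapshot_id):
    """Layer 3: a minimal query engine that reads the files of one snapshot."""
    for path in snapshots[snapshot_id]:
        for region, amount in csv.reader(io.StringIO(object_store[path])):
            yield region, float(amount)


def total_by_region(snapshot_id):
    """Aggregate amounts per region over the rows visible in a snapshot."""
    totals = {}
    for region, amount in scan(snapshot_id):
        totals[region] = totals.get(region, 0.0) + amount
    return totals
```

Because each layer only depends on the one below it through a narrow interface (paths and snapshots), any one of them could be swapped out independently, which is the property the unbundled architecture relies on.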
![](https://api.funkhaus.us/wp-content/uploads/2023/06/image-96.jpeg)
As we learn to wield foundation models in useful ways, questions about how to integrate them into software naturally emerge. Natural language “prompting” as a UX breeds new and interesting challenges for developers. The opportunity to build new infrastructure and tools that make it easier to build with language models is increasingly clear, and has become the single most active design space in infrastructure over the past year. Just as compute platforms tailored to web development emerged, so will those for AI developers. I believe the winning platform will offer strong Python support, a seamless experience between the user’s local environment and the cloud, the ability to scale out quickly, serverless access to GPUs, and integration with existing data infrastructure and tools. Vector databases built for storage and retrieval of embeddings are another exciting area. OpenAI’s embeddings API makes it easy to build semantic search applications with proprietary data, which should drive demand for them. I also see tremendous, albeit rapidly evolving, opportunity in orchestration. Projects like LangChain and LlamaIndex help developers integrate the process of “prompting” language models into their applications. While multi-modal models with larger context windows may simplify things, I believe we are only scratching the surface of the intelligence and automation that language models will bring to every aspect of software development.
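For readers less familiar with what a vector database does at its core, here is a minimal sketch of embedding retrieval: a brute-force nearest-neighbor search by cosine similarity. The function names and the tiny three-dimensional vectors are made up for illustration. Real embeddings come from a model and have hundreds or thousands of dimensions, and real vector databases use approximate indexes rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in index.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical document embeddings, keyed by document id.
index = {
    "billing": [0.9, 0.1, 0.0],
    "shipping": [0.1, 0.9, 0.0],
    "refunds": [0.8, 0.2, 0.1],
}
```

Calling `top_k([1.0, 0.0, 0.1], index)` ranks "billing" and "refunds" ahead of "shipping", since their vectors point in nearly the same direction as the query.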
— Bucky