The window is closing.
You've probably heard the buzz, right? AI is coming for people's jobs, software engineers wielding these HAL-9000s don't need big teams anymore, yadda yadda. But here's the thing: I think we might be living in the twilight of the golden age of software engineering. And if we're smart, we'll make the most of it while we still can.
Nightmare on Code Street
Let's face it, the writing's on the wall (or should we say, in the console?). These AI coding assistants are getting scary good. I was pair programming with ChatGPT the other day, and I swear that thing is wayyy better than me. If you're a hiring manager reading this, I'm just kidding, I'm good enough to be hired, sir 🫡
But jokes aside, this tech is advancing faster than I can learn those new bloated JavaScript frameworks (and that's saying something!).
The way I see it, we're in this weird limbo. On one hand, there's still a huge demand for software engineers. Companies are throwing money and perks at anyone who can solve a LeetCode two-sum. On the other hand, AI is nipping at our heels, threatening to turn coding into something more… accessible.
Every day, those AI models are getting smarter, more capable. They're not just writing "Hello, World!" anymore - they're slowly becoming more able to craft entire systems while we're still trying to remember how to write a for-loop in that language.
So here's my hot take: this might be our last chance to cash in on the relative scarcity of coding skills. Think about it - right now, knowing how to code is like having a superpower (earning money!). But what happens when AI democratizes that superpower? Suddenly, the playing field gets a lot more crowded.
Don't get me wrong, I'm not saying software engineering is going away. But the days of companies desperately hiring anyone who can cobble together some APIs might be numbered. We're looking at a future where the bar for entry gets higher and the competition gets fiercer.
We might be the last generation of programmers who get to experience the wild west of tech. The days of "move fast and break things" are numbered. Soon, it'll be "move fast and let the average Joe prompt and shovel data to the AI to fix things."
Now, obviously I'm going full-on FUD here to make a spicier post. But the AI train is definitely coming, and it's coming fast. With some grit, some smarts, and a whole lot of caffeine, we might just be able to hop on board instead of getting run the hell over.
The Hunger Games
Now, with interest rates climbing and growth slowing, the party's over. Companies are sobering up and realizing they don't need hordes of average Joes writing cookie-cutter code. Nope, they're looking for the cream of the crop - the kind of engineers who can innovate their way out of needing more engineers.
Enter the era of "efficiency." (Yeah, Meta, I'm looking at you and your "year of efficiency.") Layoffs are raining down like confetti at a tech conference afterparty. And let's not forget our new AI overlords, ready to code circles around us mere mortals.
But here's the real tragedy: most people haven't gotten the memo. They're still out there, grinding away on LeetCode, thinking they're punching their ticket to Silicon Valley stardom. Sorry to burst your bubble, kids, but the gold rush is over.
Now, don't get me wrong. We still need engineers. Hell, the world runs on software these days. But here's the kicker: we don't need as many as we thought we did. See, it turns out that when you build software that can scale to billions of users, you don't need to keep hiring engineers at the same rate. It's almost like… efficiency is a thing? Who'd have thought it?
So now we're in this weird twilight zone where the industry is sobering up from its growth binge, realizing it's got a software engineer hangover. All these companies that were hiring anyone who could spell "JavaScript" are now looking around and going, "Wait, why do we have 47 people working on our login page?"
And let me tell you, it's not just the quantity that's changing - it's the quality. The bar is getting higher, folks. It's not enough to be able to center a div anymore (although let's be honest, that's still black magic to half of us). Now you need to be able to do it while juggling five different frameworks, two cloud platforms, and a partridge in a pear tree.
And for those of you just starting out? Hoo boy. You've got a tough road ahead. Sure, there's a wealth of knowledge out there, but the jobs-to-applicants ratio has tanked, and the market has flipped from being dominated by young talent to being dominated by mids and seniors. You'll be fighting tooth and nail for fewer jobs against increasingly fierce competition.
As the wise and slightly terrifying Charlie Munger once said:
"I think value investors are going to have a harder time now that there's so many of them competing for a diminished bunch of opportunities."
The cold, hard truth is that future generations of devs are going to have to work harder for less reward. All that "wisdom" we've been accumulating? It's either going to become the bare minimum or completely obsolete.
Now, I'm not saying innovation is dead. There will always be new frontiers, new technologies, new problems to solve. Heck, I can think of a dozen ways we could improve developer productivity right now. But the days of easy pickings are over. It took decades to go from the birth of the internet to the iPhone. Who knows how long it'll be before the next big thing comes along?
So, unless you're the kind of coding savant who makes Linus Torvalds look like a script kiddie, brace yourself. The tech world isn't going to be handing out golden tickets like it's Willy Wonka's chocolate factory anymore. Welcome to the new normal, folks. It's time to level up or get left behind.
Carpe Diem (or Carpe Code-iem?)
So what's a code monkey to do? Well, let me tell you what I'm doing, and you can decide if you want to come along for the ride.
Proper Fundamentals
First off, forget all that theoretical mumbo-jumbo you've been cramming into your skull. You can't claim to have "proper fundamentals" unless you've actually built stuff. Real stuff. Messy, broken, frustrating stuff. Theory without practice is like a bicycle without wheels - utterly useless. I just rewatched Oppenheimer, and I've always loved it whenever he says "Theory can only take you so far."
Want a crash course in how computers actually work? Go do nand2tetris. It'll make you appreciate every single abstraction we've built over the years. But that's optional. What's not, in my opinion, is grinding through Think Python - a book everyone should go through to learn how to think about programming. And after learning some Python, work through some problems from LeetCode with NeetCode. Sure, NeetCode is basically tech-interview porn, but it'll give you a taste of the algorithmic gymnastics employers expect these days.
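Just so you know the flavor: here's the classic Two Sum (the very problem I joked about earlier) in Python, the one-pass hash-map pattern that NeetCode will drill into you. A minimal sketch, nothing fancy:

```python
def two_sum(nums: list[int], target: int) -> list[int]:
    """Return indices of the two numbers that add up to target.

    One pass with a hash map: O(n) time, O(n) space.
    """
    seen: dict[int, int] = {}  # value -> index where we saw it
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:
            return [seen[complement], i]
        seen[n] = i
    return []  # no pair found


assert two_sum([2, 7, 11, 15], 9) == [0, 1]
```

The brute-force version is two nested loops and O(n²); trading memory for a single pass is the whole lesson, and it's a trade you'll make a thousand more times.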
And now here's what you're gonna do: pick a project - any project - and build the damn thing. I don't care if it's been done a million times before. Hell, start with "Hello, World!" if you have to, but don't stop there. Keep pushing until you hit that wall where everything falls apart and you're ready to set your computer on fire. That, my friends, is where the real learning begins.
Now, let's talk about becoming a "strong" developer. Buckle up, because this is where it gets messy. There's no straightforward answer, no magic bullet. Being a good dev means being decent at a whole bunch of different things. But since you're all looking at me with those puppy dog eyes, here's my very opinionated list of the top dev skills:
- SQL. Learn it, love it, dream in queries (see the sketch after this list).
- Algorithms and data structures. Because sometimes, brute force just won't cut it.
- Naming things. Seriously, half of programming is just coming up with good names.
- Balancing act: Make it fast, make it short, make it readable. Pick two (if you're lucky).
- Refactoring. Because your first attempt will always suck.
- Test-driven development. Write tests, or forever be plagued by bugs.
- Systems-level voodoo: memory, processes, threads, deadlocks. It's all fun and games until your app crashes and burns.
- Security. Please, for the love of all that is holy, don't store passwords in plain text.
- Conventions. Follow them, or be forever shunned by your peers.
- Source control. Git good or get out.
- Project management. Breaking big problems into small, digestible chunks is an art form.
- Developer ergonomics. Your wrists will thank you later.
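To ground the first item on that list: here's roughly what day-to-day SQL looks like, sketched with Python's built-in sqlite3 module. The orders table and its data are purely illustrative:

```python
import sqlite3

# In-memory database with a toy schema, just for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ada", 30.0), (2, "ada", 12.5), (3, "linus", 99.9)],
)

# The bread-and-butter query shape: group, aggregate, sort.
rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n_orders, SUM(total) AS revenue
    FROM orders
    GROUP BY customer
    ORDER BY revenue DESC
    """
).fetchall()
print(rows)  # [('linus', 1, 99.9), ('ada', 2, 42.5)]
```

If you can read that without squinting, you're ahead of a shocking number of working devs.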
Now, I know what you're thinking: "That's too much! I'll never learn it all!" Well, man, this field is vast and incomprehensible, and the sooner you accept that, the happier you'll be. Just keep chipping away at it, one day at a time.
Want some benchmarks? Try solving the LeetCode 75 problems in 30 minutes each. Sketch out the architecture for YouTube or Twitter without breaking into a cold sweat. Build a simple app in your chosen stack in an hour. If you can do all that, you're… well, you can impress anyone, anywhere.
And for crying out loud, learn how to actually solve problems:
- Understand the user's perspective. Don't build solutions to problems that don't exist.
- Learn the lingo. If you can't talk the talk, you'll never walk the walk.
- KISS: Keep It Simple, Stupid. The fanciest solution is rarely the best one.
- Know your data. It's the foundation of everything you'll build.
- Technical debt is like credit card debt - a little is fine, too much will ruin you.
There you have it, folks. The unvarnished, ugly truth about what it takes to be a halfway decent developer. It's a long, painful road full of late nights, caffeine overdoses, and existential crises. But hey, at least the pay is still good (for now).
System Design
And from there we'll want to dive deep into the guts of systems. I'm talking distributed systems, concurrency, the nitty-gritty of how computers actually work. Because let me tell you, AI might be able to spit out a React component faster than you can google what to npm install, but it still struggles with the really gnarly stuff. You know, the kind of problems that make you question your life choices at 3 AM while you're knee-deep in core dumps.
Sure, GPT and Copilot can spit out code faster than you can type it. But understanding the intricate dance of distributed systems, the delicate balance of consistency and availability, the art of designing systems that scale to millions of users? That takes a human touch, my friend.
So here's my advice: Don't just study this stuff. Live it. Breathe it. Build things. Break things. Then figure out why they broke and build them again, but better. Set up a Kubernetes cluster in your basement. Write your own distributed key-value store. Implement Paxos (and then cry a little, because let's face it, Paxos is a pain).
And most importantly, think about how all of this fits together. Because that's where the real magic happens. Understanding how to design systems that are scalable, reliable, and efficient? That's the kind of expertise that'll keep you employed long after AI has taken over writing CRUD apps.
Remember, in the world of distributed systems, eventual consistency isn't just a concept - it's a way of life. So keep pushing, keep learning, and always design with failure in mind.
Start small, but think big. I'll give you some project ideas.
Maybe you begin with a simple load balancer. But don't stop there. Keep adding layers. How does it handle failure? What happens when you introduce caching? How do you ensure data consistency across multiple regions?
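To make that starting point concrete, here's a minimal Python sketch of round-robin with health checks. The names are all mine, and a real load balancer obviously does this over sockets, not function calls:

```python
import itertools

class RoundRobinBalancer:
    """Toy load balancer: rotate through backends, skipping unhealthy ones."""

    def __init__(self, backends: list[str]):
        self.backends = backends
        self.healthy = set(backends)
        self._cycle = itertools.cycle(backends)

    def mark_down(self, backend: str) -> None:
        self.healthy.discard(backend)

    def mark_up(self, backend: str) -> None:
        self.healthy.add(backend)

    def pick(self) -> str:
        # Give up after one full rotation with no healthy backend.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


lb = RoundRobinBalancer(["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"])
lb.mark_down("10.0.0.2:80")
print([lb.pick() for _ in range(4)])  # rotates over the two healthy nodes
```

From here, "adding layers" means real health probes, least-connections instead of round-robin, connection draining, and so on - each one a rabbit hole worth falling into.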
And here's a pro tip: Document everything. Not just your code, but your thought process. Why did you choose this particular architecture? What trade-offs did you make? Because let me tell you, future you will thank past you when you're trying to debug some weird edge case six months down the line.
Remember, in this game, there's no such thing as a perfect system. It's all about trade-offs. The CAP theorem isn't just some academic concept - it's the harsh reality we live with every day. So get comfortable with making those hard choices. Should you optimize for consistency or availability? How much complexity are you willing to introduce for that extra bit of performance?
Build your own distributed key-value store. Start simple with a single-node in-memory store, then add persistence. Now the fun begins - shard it across multiple nodes. How will you handle consistency? Implement vector clocks. Now make it fault-tolerant. Congratulations, you've just built a mini-Cassandra!
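For the vector clock step, here's a minimal Python sketch of the core idea - tick on local events, merge on receive, compare clocks to detect conflicts. It's the concept only; a real store layers replication and read repair on top:

```python
class VectorClock:
    """Per-node logical clocks for ordering events across replicas."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.clock: dict[str, int] = {}

    def tick(self) -> dict[str, int]:
        # Local event: bump our own counter and snapshot the clock.
        self.clock[self.node_id] = self.clock.get(self.node_id, 0) + 1
        return dict(self.clock)

    def merge(self, other: dict[str, int]) -> None:
        # On receiving a message: element-wise max, then tick.
        for node, count in other.items():
            self.clock[node] = max(self.clock.get(node, 0), count)
        self.tick()


def happened_before(a: dict[str, int], b: dict[str, int]) -> bool:
    """True if a causally precedes b. False both ways means a conflict."""
    return all(a.get(k, 0) <= b.get(k, 0) for k in a) and a != b
```

The conflict case - neither clock precedes the other - is exactly where a Dynamo-style store has to keep sibling versions or punt the decision to the client.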
Implement your own consensus algorithm. Start with Raft - it's like Paxos but actually comprehensible by mere mortals. Build a cluster of nodes that can elect a leader and replicate a log. Then throw in some network partitions and see how it handles them.
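Here's a taste of step one, the randomized election timeout, as a toy Python sketch. No RPCs, no log replication - just the state transitions. The names are mine, and the timeout range is the 150-300 ms ballpark the Raft paper suggests:

```python
import random
import time

ELECTION_TIMEOUT_RANGE = (0.15, 0.30)  # seconds; randomization avoids split votes

class Node:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.state = "follower"
        self.term = 0
        self.reset_timer()

    def reset_timer(self) -> None:
        self.deadline = time.monotonic() + random.uniform(*ELECTION_TIMEOUT_RANGE)

    def on_heartbeat(self, term: int) -> None:
        # A heartbeat from a current-or-newer leader keeps us a follower.
        if term >= self.term:
            self.term = term
            self.state = "follower"
            self.reset_timer()

    def maybe_start_election(self) -> bool:
        # No heartbeat before the deadline? Become a candidate.
        if self.state != "leader" and time.monotonic() >= self.deadline:
            self.term += 1
            self.state = "candidate"
            self.reset_timer()
            return True  # the caller would now send RequestVote RPCs
        return False
```

The randomized deadline is the whole trick: whoever times out first usually wins the election before anyone else even wakes up.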
Build a distributed file system. Start with a basic client-server model, then shard your files across multiple nodes. Implement replication for fault tolerance. How will you handle concurrent writes? Can you implement something like HDFS's write-once, read-many semantics?
Build a toy search engine. Crawl a subset of the web, index it, and implement distributed search. How will you parallelize the crawling? How about distributed indexing? Can you implement something like PageRank? How will you deploy it?
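For the PageRank part, the core algorithm fits in a page: power iteration over an adjacency map. A toy, in-memory sketch - the distributed version is where the real project lives:

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Toy PageRank by power iteration over an adjacency dict."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                for target in outgoing:
                    new_rank[target] += damping * rank[page] / len(outgoing)
        rank = new_rank
    return rank


web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print(pagerank(web))  # "c" accumulates the most rank
```

Once your crawl is bigger than RAM, this inner loop becomes a distributed job - which is exactly the point of the exercise.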
Building these systems from scratch will give you insights that no amount of textbook reading or system-design blog posts and YouTube videos ever could. You'll understand viscerally why certain design decisions were made in real-world systems, and the trials of implementing them will expose failure scenarios you might never have thought of.
Plus, imagine walking into your next job interview bragging about that scalable distributed search engine you built. You'll be speaking the same language as the senior engineers, because you've wrestled with the same problems they have: building and deploying multiple nodes on AWS, processing large-scale datasets in the cloud, implementing graph processing in practice, probably wiring up something like Kafka for data pipelines, optimizing throughput as you parallelize your crawler, and more.
Here's a further list of projects on my to-do list.
Data Infra
- ETL Pipeline Generator for Data Warehouse Design: design a tool that automatically proposes an ETL (Extract, Transform, Load) pipeline to convert a normalized operational database into a star schema. Add additional functionality to maintain the star schema and potentially incorporate other data sources.
- Workload-Aware Index and View Optimization System: create a system that recommends optimal indices based on specific workloads and database statistics. Alternatively, explore algorithms for determining the most effective materialized views in data warehouse systems like Redshift.
- LLM-Powered Database Management Assistant: leverage GPT-4 or similar large language models to develop a system that aids in the setup and optimization of PostgreSQL or MySQL databases. Investigate effective prompting strategies, including system state and error message inputs, and explore methods for user interaction through a chat interface.
- Natural Language to SQL Query Converter: implement a system that translates natural language text or speech into SQL queries for a specified database schema. Focus on a specific subset of queries to ensure feasibility. Evaluate the effectiveness of LLMs for this task, comparing their performance with existing libraries.
- Generative AI-Database Integration Framework: explore novel database functions (e.g., UDFs, synopses) to facilitate the use of stored data in generative AI applications like GPT-4. Investigate potential SQL extensions to streamline prompt engineering processes.
- Vector-Embedded Database Search with LLM Integration: research methods for embedding databases using vector representations (e.g., BERT, TaBERT) and combining them with LLMs to enable efficient search capabilities in large-scale database systems.
- Cross-Database Query Orchestration Agent: develop an LLM-based system to simplify operations across multiple database systems. Investigate the feasibility of using models like GPT-4 to determine data location, route queries appropriately, process cross-database queries, and identify query limitations.
- Web-based Schema Population Engine: create a tool that, given a user-defined ER schema, automatically searches the web for relevant data to populate it. Utilize APIs like Google's data search to find and integrate appropriate information into the schema.
- Approximate Query Processing for Data Warehouses: investigate the use of random sampling techniques to evaluate SQL queries on read-only databases. Analyze potential performance improvements and develop methods to provide approximation guarantees (see the sampling sketch after this list).
- Automated Data Augmentation System: build a system that automatically identifies and integrates related datasets to enhance existing data. For example, augmenting energy consumption time series data with corresponding temperature data.
- Distributed System Configuration Auto-Tuner: develop a tool to automatically optimize configuration parameters for distributed systems like Apache Spark or traditional databases (e.g., Postgres, MySQL). Focus on achieving optimal performance for specific query workloads such as TPC-H.
- Secure Multi-Party Database Query Framework: implement a system using secure multiparty computation techniques to enable multiple parties with partial database views to query the combined dataset without compromising data privacy.
- Machine Learning-based Query Performance Predictor: collect query execution time data and train a machine learning model (e.g., a CNN) to predict the performance of new queries. Alternatively, develop an ML-based system to estimate remaining query execution time in database systems like Postgres.
- In-Database vs. External Analytics Performance Analyzer: compare the performance of statistical and machine learning operations executed within a database against the same operations performed on exported data using external tools like MATLAB. Reference projects like MADlib for in-database analytics.
- Optimized Data Parsing Engine: investigate novel approaches to parsing CSV, JSON, and text data into database systems. Draw inspiration from papers such as "Filter Before You Parse" and "Mison: A Fast JSON Parser for Data Analytics" to develop efficient parsing strategies.
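As flagged in the approximate-query item above, the core of that idea fits in a few lines: answer an aggregate from a uniform random sample, scale it up, and attach a confidence bound. A toy Python sketch - a real system would push the sampling into the database instead of pulling rows out:

```python
import random

def approx_sum(rows: list[float], sample_size: int = 1_000) -> tuple[float, float]:
    """Estimate SUM(col) from a uniform sample: (estimate, ~95% half-width)."""
    n = len(rows)
    sample = random.sample(rows, min(sample_size, n))
    mean = sum(sample) / len(sample)
    estimate = n * mean  # scaled sample mean
    # Sample variance -> standard error of the scaled estimate.
    var = sum((x - mean) ** 2 for x in sample) / max(len(sample) - 1, 1)
    half_width = 1.96 * n * (var / len(sample)) ** 0.5
    return estimate, half_width


data = [random.uniform(0, 100) for _ in range(1_000_000)]
est, hw = approx_sum(data)
print(f"estimated sum = {est:,.0f} ± {hw:,.0f}")
```

Scanning 0.1% of the rows for an answer that's within a percent or two is the entire sales pitch of approximate query processing.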
Distributed Systems
- Decentralized Social Media Aggregation Platform: design and implement a distributed, fault-tolerant system similar to Reddit, focusing on decentralized architecture to enhance resilience and user autonomy. Address challenges such as consensus mechanisms, content moderation in a decentralized environment, and efficient data propagation across nodes.
- Fault-Tolerant Node.js Application Framework: develop a system that enhances the fault tolerance of Node.js applications through replicated execution or other robust distributed systems techniques. Consider implementing features like state replication, automatic failover, and consistency management to ensure continuous operation in the face of node failures or network partitions.
- Hybrid Big Data and Real-Time Processing Engine: create a data processing system that efficiently handles both batch processing of large datasets (similar to MapReduce and Spark) and real-time, online processing (akin to key-value stores or SQL databases). Focus on designing an architecture that seamlessly integrates these two paradigms, addressing challenges such as data consistency, resource allocation, and query optimization across different processing modes.
- Unified Real-Time Collaboration Platform: develop a comprehensive platform integrating advanced text, voice, and video communication with real-time collaboration features, AI-enhanced video compression, and seamless integration with productivity tools.
- Distributed Object-Oriented Network Framework for C++: design and implement a flexible, efficient object system enabling seamless network communication in C++, with features for serialization, deserialization, and remote method invocation.
- Optimized Log-Structured Distributed Storage System: create an enhanced distributed log-structured storage system addressing Porcupine's limitations, focusing on improved fault tolerance, reduced write amplification, and optimized performance in large-scale deployments.
- Distributed Protocol Fault Injection and Verification Framework: build a comprehensive tool for systematically testing and verifying distributed protocols under various failure scenarios, ensuring robustness and correctness.
- Consensus Algorithm Cross-Implementation Validator: develop an infrastructure for testing multiple implementations of consensus algorithms against each other, identifying protocol discrepancies and ensuring consistent behavior.
- Modern Viewstamped Replication Protocol Implementation: implement an updated version of the viewstamped replication protocol based on the latest research, designed for easy integration into existing distributed systems.
- Dependency-Aware Distributed Build System: develop an intelligent distributed parallel build tool that infers true dependencies through system call interception, optimizing for correctness and parallel execution.
- Fault-Tolerant Distributed File System: build a large-scale distributed file storage system incorporating RAID-like redundancy mechanisms for data reliability and scalability.
- Scalable Storage Virtualization Infrastructure: create a scalable virtual disk system inspired by Petal, leveraging modern tools to provide efficient storage virtualization for large-scale applications.
- Lightweight Distributed Lock Service: develop a simplified version of Google's Chubby, focusing on providing a reliable and highly available locking mechanism for distributed systems.
- Decentralized Object Storage with Consensus-based Metadata Management: design a version of MogileFS that eliminates the centralized database by using consensus algorithms for metadata replication.
- Consistent Hashing-based Distributed Web Cache: build a scalable web caching system using consistent hashing for even distribution of cache entries across multiple nodes (see the sketch after this list).
- Consensus-driven Highly Available DNS Server: implement a replicated DNS server using consensus algorithms to ensure consistency of updates and reliable domain name resolution.
- Cross-Node Distributed Systems Debugger: develop a parallel debugger supporting distributed systems by tracking execution flow across message passing and synchronization points.
- Distributed Systems Performance Analysis Tool: create a profiling tool providing detailed insights into performance bottlenecks in distributed systems.
- Transparent Service Replication Framework: build a system-call or message-level interposition library for transparent replication of networked services, ensuring fault tolerance.
- Granular Network Security Middleware: develop a message-level interposition library adding fine-grained security features to existing networked services.
- Multi-Device File Synchronization Framework: create an efficient and reliable file synchronization tool for multiple devices and platforms.
- Byzantine Fault-Tolerant Consensus Algorithm: design and implement a version of a consensus algorithm capable of tolerating Byzantine faults.
- Interactive Consensus Protocol Visualization Tool: build a visualization tool like RaftScope for understanding and debugging complex consensus protocol behaviors.
- Cryptographically Secure Distributed Contact Tracing System: develop a privacy-preserving mobile application for contact tracing, leveraging advanced cryptographic techniques to securely track disease spread without compromising individual privacy.
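As promised in the web-cache item above, consistent hashing is the one trick from that list you can build in an afternoon. A minimal Python sketch with virtual nodes - the class, names, and hash choice here are mine, purely illustrative:

```python
import bisect
import hashlib

class HashRing:
    """Consistent hashing with virtual nodes (a sketch, not production code)."""

    def __init__(self, nodes: list[str], vnodes: int = 100):
        # Each physical node gets many points on the ring for even spread.
        self._ring: list[tuple[int, str]] = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # First vnode clockwise from the key's hash; wrap around at the end.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("/index.html"))
```

The payoff: adding or removing a node remaps only about 1/N of the keys, instead of reshuffling everything the way naive `hash(key) % N` does.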
Learn to Code with AI
I'm also getting cozy with our new AI overlords. Not in a "please don't replace me" way, but in a "let's tango" kind of way. I'm learning how to whisper sweet nothings into GPT's ear to make it do my bidding. Prompt engineering isn't just for the ML crowd anymore, my friends. It's a survival skill.
Imma Niche
And let's talk about specialization. Remember when being a "full-stack developer" was all the rage? Well, now it's time to pick your niche and dig in like your career depends on it (because it does). Maybe it's security, or high-performance computing, or wrangling the ancient COBOL systems that run our financial infrastructure. Find something that AI can't easily replicate and become the go-to human for it.
I want to niche myself at the intersection of infra and AI. I've already covered some systems/infra projects earlier, so now I'll give you some of my project ideas concerning AI.
Let's code our own GPT. Yeah, that's right. I want you to create a baby version of those fancy language models everyone's yakking about. Start small, maybe just predicting the next character in a string. Then work your way up to finishing sentences, paragraphs, and eventually, writing your own stand-up comedy routine. Bonus points if it's actually funny.
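Here's that "start small" step made concrete: a character-level bigram model, which is about the smallest thing you can honestly call a language model. Train it by counting which character follows which, then sample:

```python
import random
from collections import Counter, defaultdict

def train_bigram(text: str) -> dict[str, Counter]:
    """Count which character tends to follow which - the tiniest possible LM."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts: dict[str, Counter], seed: str, length: int = 50) -> str:
    out = seed
    for _ in range(length):
        nxt = counts.get(out[-1])
        if not nxt:
            break  # dead end: we never saw this character in training
        chars, weights = zip(*nxt.items())
        out += random.choices(chars, weights=weights)[0]
    return out


corpus = "the quick brown fox jumps over the lazy dog. " * 100
model = train_bigram(corpus)
print(generate(model, "th"))
```

From here, the road to baby GPT is swapping counted frequencies for a trained neural net, and single characters for longer contexts and tokens.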
Or how about a snarky code reviewer? Create an AI that reviews code like that one colleague we all have. You know, the one who makes you question your life choices with every pull request. Make it sassy, make it smart, and of course, make it catch those off-by-one errors.
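A sketch of how you might wire that up, assuming the official openai Python package and an API key in your environment - the model name and prompt here are placeholders, so swap in whatever is current:

```python
from openai import OpenAI  # assumes `pip install openai` and OPENAI_API_KEY set

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a brutally sarcastic senior engineer reviewing a pull request. "
    "Be funny, but always flag real problems: off-by-one errors, bad names, "
    "missing error handling. End with one genuinely constructive suggestion."
)

def review(diff: str) -> str:
    # One chat completion per diff; a real tool would chunk large PRs.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder: use whatever model is current
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Review this diff:\n\n{diff}"},
        ],
    )
    return response.choices[0].message.content


print(review("for i in range(len(items) + 1): process(items[i])"))
```

Hook it up to your CI and enjoy getting roasted by a machine. Character building.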
Here's a few more:
- Slack-based Data Visualization Recommender: develop a tool that integrates with Slack to analyze conversations and suggest appropriate visualizations. For example, when team members discuss sales figures for a specific region, the system would propose various visual representations of the sales data.
- Speech-to-Structured Query Converter: create a system that enables structured querying of spoken text data. This tool would process recorded call center conversations, allowing analysts to pose questions like "What's the frequency of inquiries about topic X?" or "How many callers are from California?" The system would use an analyst-defined ER schema to automatically categorize and structure the audio data.
- Web-based Schema Population Tool: design a tool that, given a user-defined ER schema, automatically searches the web for relevant data to populate it. For example, if a user creates a schema with [temperature, state, date], the system would utilize Google's data search API to find and populate the schema with appropriate information.
- LLM-driven Data Science Task Optimizer: leverage large language models like ChatGPT to automate data science tasks efficiently. Focus on minimizing the number of LLM invocations or implementing a cascade of models ranging from local, cost-effective options to more sophisticated, resource-intensive ones.
- Vector Database Performance Benchmark: investigate the trade-offs between various vector storage and search methodologies. Evaluate the scalability of RAG systems like Pinecone, determine the limitations of exhaustive vector search, and compare different vector stores in terms of performance and accuracy. Consider developing a VectorStore benchmark, referencing the integrations listed in the LlamaIndex documentation.
- Efficient Raw Data Parsing Framework: explore innovative techniques for parsing CSV, JSON, and text data into databases more efficiently. Draw inspiration from papers like "Filter Before You Parse" and "Mison: A Fast JSON Parser for Data Analytics" to develop novel parsing strategies.
- Cross-platform Analytics Integration System: create a tool that streamlines analytics across various data storage formats and platforms, including structured CSV data in blob storage (e.g., S3 Parquet format), data warehouses (e.g., Redshift), and relational databases (e.g., PostgreSQL).
- Machine Learning-based Wikipedia Data Extractor: develop a supervised learning-based system that can extract structured data from Wikipedia articles, focusing on inferring the content of infoboxes from the surrounding text.
- Computer Vision-based Map Metadata Extractor: build a system that uses computer vision techniques to extract metadata from imagery, such as identifying speed limits from Google Street View images, to enhance digital maps like OpenStreetMap. Utilize existing tools like Google Cloud Vision API for text extraction and object identification.
- Approximate Join Algorithm for Unmodified Databases: research and implement techniques for approximately executing database joins without modifying the underlying database system. Use the WanderJoin algorithm (as described in the SIGMOD '16 paper) as a starting point for this investigation.
- Automated Time Series Forecasting Framework: create a comprehensive system to simplify the time series forecasting process, encompassing data integration, automatic parameter tuning, and user interface design. Consider using Facebook's Prophet library or similar tools as a foundation for this project.
Learn How to Rizz 🕺
And lastly, the real secret sauce - and I can't believe I'm sharing this, but we're all friends here, right? Learn how to talk. Rizz, baby. Be a rizzler. I know, I know, it's scary. We'd all rather nest our for-loops than go to a meeting with actual people. But trust me, the ability to translate geek-speak into something the suits upstairs can understand? That's gold, baby. Pure gold.
So fire up those IDEs, crack open a fresh energy drink, and let's make some magic happen. Learn the deep stuff, dance with AI, find your niche, and for the love of all that is holy, learn to communicate with non-programmers.
The Final Commit
Look, I'm not trying to be all doom and gloom here (although I admitted I'm FUDing because I'm a lame writer), but I find this is exactly the kind of perspective that gives you a kick in the nuts and makes you stand up and fight back. The future of software engineering is likely going to be awesome in ways we can't even imagine. But if you're like me, still learning the ropes, then we need to act fast.
This is our moment, our last dance in the era where "I know how to code" is enough to open doors. So let's make it count. Learn, build, connect, and position ourselves to ride this wave of change instead of being swept away by it.
Who knows? Maybe in a few years, we'll be the grizzled veterans, regaling newbies with tales of the wild west days of software engineering. "Back in my day, we had to write our own for-loops, uphill both ways!"
So, fellow code monkeys, shall we dance?