The window is closing.
You've probably heard the buzz, right? AI is coming for people's jobs, software engineers wielding these HAL-9000s don't need big teams anymore, yadda yadda. But here's the thing: I think we might be living in the twilight of the golden age of software engineering. And if we're smart, we'll make the most of it while we still can.
Nightmare on Code Street
Let's face it, the writing's on the wall (or should we say, in the console?). These AI coding assistants are getting scary good. I was pair programming with ChatGPT the other day, and I swear that thing is wayyy better than me. If you're a hiring manager reading this, I'm just kidding, I'm good enough to be hired, sir 🫡
But jokes aside, this tech is advancing faster than I can learn those new bloated JavaScript frameworks (and that's saying something!).
The way I see it, we're in this weird limbo. On one hand, there's still a huge demand for software engineers. Companies are throwing money and perks at anyone who can solve a LeetCode two-sum. On the other hand, AI is nipping at our heels, threatening to turn coding into something more… accessible.
Every day, those AI models are getting smarter, more capable. They're not just writing "Hello, World!" anymore - they're slowly becoming more able to craft entire systems while we're still trying to remember how to write a for-loop in that language.
So here's my hot take: this might be our last chance to cash in on the relative scarcity of coding skills. Think about it - right now, knowing how to code is like having a superpower (earning money!). But what happens when AI democratizes that superpower? Suddenly, the playing field gets a lot more crowded.
Don't get me wrong, I'm not saying software engineering is going away. But the days of companies desperately hiring anyone who can cobble together some APIs might be numbered. We're looking at a future where the bar for entry gets higher and the competition gets fiercer.
We might be the last generation of programmers who get to experience the wild west of tech. The days of "move fast and break things" are numbered. Soon, it'll be "move fast and let the average Joe prompt and shovel data to the AI to fix things."
Now, obviously I'm going full-on FUD here to make a spicier post. But the AI train is definitely coming, and it's coming fast. With some grit, some smarts, and a whole lot of caffeine, we might just be able to hop on board instead of getting run the hell over.
The Hunger Games
Now, with interest rates climbing and growth slowing, the party's over. Companies are sobering up and realizing they don't need hordes of average Joes writing cookie-cutter code. Nope, they're looking for the cream of the crop - the kind of engineers who can innovate their way out of needing more engineers.
Enter the era of "efficiency." (Yeah, Meta, I'm looking at you and your "year of efficiency.") Layoffs are raining down like confetti at a tech conference afterparty. And let's not forget our new AI overlords, ready to code circles around us mere mortals.
But here's the real tragedy: most people haven't gotten the memo. They're still out there, grinding away on LeetCode, thinking they're punching their ticket to Silicon Valley stardom. Sorry to burst your bubble, kids, but the gold rush is over.
Now, don't get me wrong. We still need engineers. Hell, the world runs on software these days. But here's the kicker: we don't need as many as we thought we did. See, it turns out that when you build software that can scale to billions of users, you don't need to keep hiring engineers at the same rate. It's almost like… efficiency is a thing? Who'd have thought it?
So now we're in this weird twilight zone where the industry is sobering up from its growth binge, realizing it's got a software engineer hangover. All these companies that were hiring anyone who could spell "JavaScript" are now looking around and going, "Wait, why do we have 47 people working on our login page?"
And let me tell you, it's not just the quantity that's changing - it's the quality. The bar is getting higher, folks. It's not enough to be able to center a div anymore (although let's be honest, that's still black magic to half of us). Now you need to be able to do it while juggling five different frameworks, two cloud platforms, and a partridge in a pear tree.
And for those of you just starting out? Hoo boy. You've got a tough road ahead. Sure, there's a wealth of knowledge out there, but the jobs-to-applicants ratio has tanked, and the market has flipped from being dominated by young talent to being dominated by mids and seniors. You'll be fighting tooth and nail for fewer jobs against increasingly fierce competition.
As the wise and slightly terrifying Charlie Munger once said:
"I think value investors are going to have a harder time now that there's so many of them competing for a diminished bunch of opportunities."
The cold, hard truth is that future generations of devs are going to have to work harder for less reward. All that "wisdom" we've been accumulating? It's either going to become the bare minimum or completely obsolete.
Now, I'm not saying innovation is dead. There will always be new frontiers, new technologies, new problems to solve. Heck, I can think of a dozen ways we could improve developer productivity right now. But the days of easy pickings are over. It took decades to go from the birth of the internet to the iPhone. Who knows how long it'll be before the next big thing comes along?
So, unless you're the kind of coding savant who makes Linus Torvalds look like a script kiddie, brace yourself. The tech world isn't going to be handing out golden tickets like it's Willy Wonka's chocolate factory anymore. Welcome to the new normal, folks. It's time to level up or get left behind.
Carpe Diem (or Carpe Code-iem?)
So what's a code monkey to do? Well, let me tell you what I'm doing, and you can decide if you want to come along for the ride.
Proper Fundamentals
First off, forget all that theoretical mumbo-jumbo you've been cramming into your skull. You can't claim to have "proper fundamentals" unless you've actually built stuff. Real stuff. Messy, broken, frustrating stuff. Theory without practice is like a bicycle without wheels - utterly useless. I just rewatched Oppenheimer, and I've always loved it whenever he says "Theory can only take you so far."
Want a crash course in how computers actually work? Go do nand2tetris. It'll make you appreciate every single abstraction we've built over the years. But that's optional. What's not, in my opinion, is grinding through Think Python - a book everyone should go through to learn how to think about programming. And after learning some Python, work through some problems from LeetCode with NeetCode. Sure, NeetCode is basically tech-interview porn, but it'll give you a taste of the algorithmic gymnastics employers expect these days.
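Just so you know the flavor: here's the classic Two Sum (the very problem I joked about earlier) in Python, the one-pass hash-map pattern that NeetCode will drill into you. A minimal sketch, nothing fancy:

```python
def two_sum(nums: list[int], target: int) -> list[int]:
    """Return indices of the two numbers that add up to target.

    One pass with a hash map: O(n) time, O(n) space.
    """
    seen: dict[int, int] = {}  # value -> index where we saw it
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:
            return [seen[complement], i]
        seen[n] = i
    return []  # no pair found


assert two_sum([2, 7, 11, 15], 9) == [0, 1]
```

The brute-force version is two nested loops and O(n²); trading memory for a single pass is the whole lesson, and it's a trade you'll make a thousand more times.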
And now here's what you're gonna do: pick a project - any project - and build the damn thing. I don't care if it's been done a million times before. Hell, start with "Hello, World!" if you have to, but don't stop there. Keep pushing until you hit that wall where everything falls apart and you're ready to set your computer on fire. That, my friends, is where the real learning begins.
Now, let's talk about becoming a "strong" developer. Buckle up, because this is where it gets messy. There's no straightforward answer, no magic bullet. Being a good dev means being decent at a whole bunch of different things. But since you're all looking at me with those puppy dog eyes, here's my very opinionated list of the top dev skills:
- SQL. Learn it, love it, dream in queries (see the sketch after this list).
- Algorithms and data structures. Because sometimes, brute force just won't cut it.
- Naming things. Seriously, half of programming is just coming up with good names.
- Balancing act: Make it fast, make it short, make it readable. Pick two (if you're lucky).
- Refactoring. Because your first attempt will always suck.
- Test-driven development. Write tests, or forever be plagued by bugs.
- Systems-level voodoo: memory, processes, threads, deadlocks. It's all fun and games until your app crashes and burns.
- Security. Please, for the love of all that is holy, don't store passwords in plain text.
- Conventions. Follow them, or be forever shunned by your peers.
- Source control. Git good or get out.
- Project management. Breaking big problems into small, digestible chunks is an art form.
- Developer ergonomics. Your wrists will thank you later.
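To ground the first item on that list: here's roughly what day-to-day SQL looks like, sketched with Python's built-in sqlite3 module. The orders table and its data are purely illustrative:

```python
import sqlite3

# In-memory database with a toy schema, just for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ada", 30.0), (2, "ada", 12.5), (3, "linus", 99.9)],
)

# The bread-and-butter query shape: group, aggregate, sort.
rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n_orders, SUM(total) AS revenue
    FROM orders
    GROUP BY customer
    ORDER BY revenue DESC
    """
).fetchall()
print(rows)  # [('linus', 1, 99.9), ('ada', 2, 42.5)]
```

If you can read that without squinting, you're ahead of a shocking number of working devs.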
Now, I know what you're thinking: "That's too much! I'll never learn it all!" Well, man, this field is vast and incomprehensible, and the sooner you accept that, the happier you'll be. Just keep chipping away at it, one day at a time.
Want some benchmarks? Try solving the LeetCode 75 problems in 30 minutes each. Sketch out the architecture for YouTube or Twitter without breaking into a cold sweat. Build a simple app in your chosen stack in an hour. If you can do all that, you're… well, you can impress anyone, anywhere.
And for crying out loud, learn how to actually solve problems:
- Understand the user's perspective. Don't build solutions to problems that don't exist.
- Learn the lingo. If you can't talk the talk, you'll never walk the walk.
- KISS: Keep It Simple, Stupid. The fanciest solution is rarely the best one.
- Know your data. It's the foundation of everything you'll build.
- Technical debt is like credit card debt - a little is fine, too much will ruin you.
There you have it, folks. The unvarnished, ugly truth about what it takes to be a halfway decent developer. It's a long, painful road full of late nights, caffeine overdoses, and existential crises. But hey, at least the pay is still good (for now).
System Design
And from there we'll want to dive deep into the guts of systems. I'm talking distributed systems, concurrency, the nitty-gritty of how computers actually work. Because let me tell you, AI might be able to spit out a React component faster than you can google what to npm install, but it still struggles with the really gnarly stuff. You know, the kind of problems that make you question your life choices at 3 AM while you're knee-deep in core dumps.
Sure, GPT and Copilot can spit out code faster than you can type it. But understanding the intricate dance of distributed systems, the delicate balance of consistency and availability, the art of designing systems that scale to millions of users? That takes a human touch, my friend.
So here's my advice: Don't just study this stuff. Live it. Breathe it. Build things. Break things. Then figure out why they broke and build them again, but better. Set up a Kubernetes cluster in your basement. Write your own distributed key-value store. Implement Paxos (and then cry a little, because let's face it, Paxos is a pain).
And most importantly, think about how all of this fits together. Because that's where the real magic happens. Understanding how to design systems that are scalable, reliable, and efficient? That's the kind of expertise that'll keep you employed long after AI has taken over writing CRUD apps.
Remember, in the world of distributed systems, eventual consistency isn't just a concept - it's a way of life. So keep pushing, keep learning, and always design with failure in mind.
Start small, but think big. I'll give you some project ideas.
Maybe you begin with a simple load balancer. But don't stop there. Keep adding layers. How does it handle failure? What happens when you introduce caching? How do you ensure data consistency across multiple regions?
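To make that starting point concrete, here's a minimal Python sketch of round-robin with health checks. The names are all mine, and a real load balancer obviously does this over sockets, not function calls:

```python
import itertools

class RoundRobinBalancer:
    """Toy load balancer: rotate through backends, skipping unhealthy ones."""

    def __init__(self, backends: list[str]):
        self.backends = backends
        self.healthy = set(backends)
        self._cycle = itertools.cycle(backends)

    def mark_down(self, backend: str) -> None:
        self.healthy.discard(backend)

    def mark_up(self, backend: str) -> None:
        self.healthy.add(backend)

    def pick(self) -> str:
        # Give up after one full rotation with no healthy backend.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")


lb = RoundRobinBalancer(["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"])
lb.mark_down("10.0.0.2:80")
print([lb.pick() for _ in range(4)])  # rotates over the two healthy nodes
```

From here, "adding layers" means real health probes, least-connections instead of round-robin, connection draining, and so on - each one a rabbit hole worth falling into.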
And here's a pro tip: Document everything. Not just your code, but your thought process. Why did you choose this particular architecture? What trade-offs did you make? Because let me tell you, future you will thank past you when you're trying to debug some weird edge case six months down the line.
Remember, in this game, there's no such thing as a perfect system. It's all about trade-offs. The CAP theorem isn't just some academic concept - it's the harsh reality we live with every day. So get comfortable with making those hard choices. Should you optimize for consistency or availability? How much complexity are you willing to introduce for that extra bit of performance?
Build your own distributed key-value store. Start simple with a single-node in-memory store, then add persistence. Now the fun begins - shard it across multiple nodes. How will you handle consistency? Implement vector clocks. Now make it fault-tolerant. Congratulations, you've just built a mini-Cassandra!
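For the vector clock step, here's a minimal Python sketch of the core idea - tick on local events, merge on receive, compare clocks to detect conflicts. It's the concept only; a real store layers replication and read repair on top:

```python
class VectorClock:
    """Per-node logical clocks for ordering events across replicas."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.clock: dict[str, int] = {}

    def tick(self) -> dict[str, int]:
        # Local event: bump our own counter and snapshot the clock.
        self.clock[self.node_id] = self.clock.get(self.node_id, 0) + 1
        return dict(self.clock)

    def merge(self, other: dict[str, int]) -> None:
        # On receiving a message: element-wise max, then tick.
        for node, count in other.items():
            self.clock[node] = max(self.clock.get(node, 0), count)
        self.tick()


def happened_before(a: dict[str, int], b: dict[str, int]) -> bool:
    """True if a causally precedes b. False both ways means a conflict."""
    return all(a.get(k, 0) <= b.get(k, 0) for k in a) and a != b
```

The conflict case - neither clock precedes the other - is exactly where a Dynamo-style store has to keep sibling versions or punt the decision to the client.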
Implement your own consensus algorithm. Start with Raft - it's like Paxos but actually comprehensible by mere mortals. Build a cluster of nodes that can elect a leader and replicate a log. Then throw in some network partitions and see how it handles them.
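Here's a taste of step one, the randomized election timeout, as a toy Python sketch. No RPCs, no log replication - just the state transitions. The names are mine, and the timeout range is the 150-300 ms ballpark the Raft paper suggests:

```python
import random
import time

ELECTION_TIMEOUT_RANGE = (0.15, 0.30)  # seconds; randomization avoids split votes

class Node:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.state = "follower"
        self.term = 0
        self.reset_timer()

    def reset_timer(self) -> None:
        self.deadline = time.monotonic() + random.uniform(*ELECTION_TIMEOUT_RANGE)

    def on_heartbeat(self, term: int) -> None:
        # A heartbeat from a current-or-newer leader keeps us a follower.
        if term >= self.term:
            self.term = term
            self.state = "follower"
            self.reset_timer()

    def maybe_start_election(self) -> bool:
        # No heartbeat before the deadline? Become a candidate.
        if self.state != "leader" and time.monotonic() >= self.deadline:
            self.term += 1
            self.state = "candidate"
            self.reset_timer()
            return True  # the caller would now send RequestVote RPCs
        return False
```

The randomized deadline is the whole trick: whoever times out first usually wins the election before anyone else even wakes up.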
Build a distributed file system. Start with a basic client-server model, then shard your files across multiple nodes. Implement replication for fault tolerance. How will you handle concurrent writes? Can you implement something like HDFS's write-once, read-many semantics?
Build a toy search engine. Crawl a subset of the web, index it, and implement distributed search. How will you parallelize the crawling? How about distributed indexing? Can you implement something like PageRank? How will you deploy it?
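For the PageRank part, the core algorithm fits in a page: power iteration over an adjacency map. A toy, in-memory sketch - the distributed version is where the real project lives:

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Toy PageRank by power iteration over an adjacency dict."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                for target in outgoing:
                    new_rank[target] += damping * rank[page] / len(outgoing)
        rank = new_rank
    return rank


web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print(pagerank(web))  # "c" accumulates the most rank
```

Once your crawl is bigger than RAM, this inner loop becomes a distributed job - which is exactly the point of the exercise.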
Building these systems from scratch will give you insights that no amount of textbook reading or system-design blog posts and YouTube videos ever could. You'll understand viscerally why certain design decisions were made in real-world systems, and the trials of implementing them will expose failure scenarios you might never have thought of.
Plus, imagine walking into your next job interview bragging about that scalable distributed search engine you built. You'll be speaking the same language as the senior engineers, because you've wrestled with the same problems they have: building and deploying multiple nodes on AWS, processing large-scale datasets in the cloud, implementing graph processing in practice, probably wiring up something like Kafka for data pipelines, optimizing throughput as you parallelize your crawler, and more.
Here's a further list of projects on my to-do list.
Data Infra
- ETL Pipeline Generator for Data Warehouse Design: design a tool that automatically proposes an ETL (Extract, Transform, Load) pipeline to convert a normalized operational database into a star schema. Add additional functionality to maintain the star schema and potentially incorporate other data sources.
- Workload-Aware Index and View Optimization System: create a system that recommends optimal indices based on specific workloads and database statistics. Alternatively, explore algorithms for determining the most effective materialized views in data warehouse systems like Redshift.
- LLM-Powered Database Management Assistant: leverage GPT-4 or similar large language models to develop a system that aids in the setup and optimization of PostgreSQL or MySQL databases. Investigate effective prompting strategies, including system state and error message inputs, and explore methods for user interaction through a chat interface.
- Natural Language to SQL Query Converter: implement a system that translates natural language text or speech into SQL queries for a specified database schema. Focus on a specific subset of queries to ensure feasibility. Evaluate the effectiveness of LLMs for this task, comparing their performance with existing libraries.
- Generative AI-Database Integration Framework: explore novel database functions (e.g., UDFs, synopses) to facilitate the use of stored data in generative AI applications like GPT-4. Investigate potential SQL extensions to streamline prompt engineering processes.
- Vector-Embedded Database Search with LLM Integration: research methods for embedding databases using vector representations (e.g., BERT, TaBERT) and combining them with LLMs to enable efficient search capabilities in large-scale database systems.
- Cross-Database Query Orchestration Agent: develop an LLM-based system to simplify operations across multiple database systems. Investigate the feasibility of using models like GPT-4 to determine data location, route queries appropriately, process cross-database queries, and identify query limitations.
- Web-based Schema Population Engine: create a tool that, given a user-defined ER schema, automatically searches the web for relevant data to populate it. Utilize APIs like Google's data search to find and integrate appropriate information into the schema.
- Approximate Query Processing for Data Warehouses: investigate the use of random sampling techniques to evaluate SQL queries on read-only databases. Analyze potential performance improvements and develop methods to provide approximation guarantees (see the sampling sketch after this list).
- Automated Data Augmentation System: build a system that automatically identifies and integrates related datasets to enhance existing data. For example, augmenting energy consumption time series data with corresponding temperature data.
- Distributed System Configuration Auto-Tuner: develop a tool to automatically optimize configuration parameters for distributed systems like Apache Spark or traditional databases (e.g., Postgres, MySQL). Focus on achieving optimal performance for specific query workloads such as TPC-H.
- Secure Multi-Party Database Query Framework: implement a system using secure multiparty computation techniques to enable multiple parties with partial database views to query the combined dataset without compromising data privacy.
- Machine Learning-based Query Performance Predictor: collect query execution time data and train a machine learning model (e.g., a CNN) to predict the performance of new queries. Alternatively, develop an ML-based system to estimate remaining query execution time in database systems like Postgres.
- In-Database vs. External Analytics Performance Analyzer: compare the performance of statistical and machine learning operations executed within a database against the same operations performed on exported data using external tools like MATLAB. Reference projects like MADlib for in-database analytics.
- Optimized Data Parsing Engine: investigate novel approaches to parsing CSV, JSON, and text data into database systems. Draw inspiration from papers such as "Filter Before You Parse" and "Mison: A Fast JSON Parser for Data Analytics" to develop efficient parsing strategies.
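As flagged in the approximate-query item above, the core of that idea fits in a few lines: answer an aggregate from a uniform random sample, scale it up, and attach a confidence bound. A toy Python sketch - a real system would push the sampling into the database instead of pulling rows out:

```python
import random

def approx_sum(rows: list[float], sample_size: int = 1_000) -> tuple[float, float]:
    """Estimate SUM(col) from a uniform sample: (estimate, ~95% half-width)."""
    n = len(rows)
    sample = random.sample(rows, min(sample_size, n))
    mean = sum(sample) / len(sample)
    estimate = n * mean  # scaled sample mean
    # Sample variance -> standard error of the scaled estimate.
    var = sum((x - mean) ** 2 for x in sample) / max(len(sample) - 1, 1)
    half_width = 1.96 * n * (var / len(sample)) ** 0.5
    return estimate, half_width


data = [random.uniform(0, 100) for _ in range(1_000_000)]
est, hw = approx_sum(data)
print(f"estimated sum = {est:,.0f} ± {hw:,.0f}")
```

Scanning 0.1% of the rows for an answer that's within a percent or two is the entire sales pitch of approximate query processing.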
Distributed Systems
- Decentralized Social Media Aggregation Platform: design and implement a distributed, fault-tolerant system similar to Reddit, focusing on decentralized architecture to enhance resilience and user autonomy. Address challenges such as consensus mechanisms, content moderation in a decentralized environment, and efficient data propagation across nodes.
- Fault-Tolerant Node.js Application Framework: develop a system that enhances the fault tolerance of Node.js applications through replicated execution or other robust distributed systems techniques. Consider implementing features like state replication, automatic failover, and consistency management to ensure continuous operation in the face of node failures or network partitions.
- Hybrid Big Data and Real-Time Processing Engine: create a data processing system that efficiently handles both batch processing of large datasets (similar to MapReduce and Spark) and real-time, online processing (akin to key-value stores or SQL databases). Focus on designing an architecture that seamlessly integrates these two paradigms, addressing challenges such as data consistency, resource allocation, and query optimization across different processing modes.
- Unified Real-Time Collaboration Platform: develop a comprehensive platform integrating advanced text, voice, and video communication with real-time collaboration features, AI-enhanced video compression, and seamless integration with productivity tools.
- Distributed Object-Oriented Network Framework for C++: design and implement a flexible, efficient object system enabling seamless network communication in C++, with features for serialization, deserialization, and remote method invocation.
- Optimized Log-Structured Distributed Storage System: create an enhanced distributed log-structured storage system addressing Porcupine's limitations, focusing on improved fault tolerance, reduced write amplification, and optimized performance in large-scale deployments.
- Distributed Protocol Fault Injection and Verification Framework: build a comprehensive tool for systematically testing and verifying distributed protocols under various failure scenarios, ensuring robustness and correctness.
- Consensus Algorithm Cross-Implementation Validator: develop an infrastructure for testing multiple implementations of consensus algorithms against each other, identifying protocol discrepancies and ensuring consistent behavior.
- Modern Viewstamped Replication Protocol Implementation: implement an updated version of the viewstamped replication protocol based on the latest research, designed for easy integration into existing distributed systems.
- Dependency-Aware Distributed Build System: develop an intelligent distributed parallel build tool that infers true dependencies through system call interception, optimizing for correctness and parallel execution.
- Fault-Tolerant Distributed File System: build a large-scale distributed file storage system incorporating RAID-like redundancy mechanisms for data reliability and scalability.
- Scalable Storage Virtualization Infrastructure: create a scalable virtual disk system inspired by Petal, leveraging modern tools to provide efficient storage virtualization for large-scale applications.
- Lightweight Distributed Lock Service: develop a simplified version of Google's Chubby, focusing on providing a reliable and highly available locking mechanism for distributed systems.
- Decentralized Object Storage with Consensus-based Metadata Management: design a version of MogileFS that eliminates the centralized database by using consensus algorithms for metadata replication.
- Consistent Hashing-based Distributed Web Cache: build a scalable web caching system using consistent hashing for even distribution of cache entries across multiple nodes (see the sketch after this list).
- Consensus-driven Highly Available DNS Server: implement a replicated DNS server using consensus algorithms to ensure consistency of updates and reliable domain name resolution.
- Cross-Node Distributed Systems Debugger: develop a parallel debugger supporting distributed systems by tracking execution flow across message passing and synchronization points.
- Distributed Systems Performance Analysis Tool: create a profiling tool providing detailed insights into performance bottlenecks in distributed systems.
- Transparent Service Replication Framework: build a system-call or message-level interposition library for transparent replication of networked services, ensuring fault tolerance.
- Granular Network Security Middleware: develop a message-level interposition library adding fine-grained security features to existing networked services.
- Multi-Device File Synchronization Framework: create an efficient and reliable file synchronization tool for multiple devices and platforms.
- Byzantine Fault-Tolerant Consensus Algorithm: design and implement a version of a consensus algorithm capable of tolerating Byzantine faults.
- Interactive Consensus Protocol Visualization Tool: build a visualization tool like RaftScope for understanding and debugging complex consensus protocol behaviors.
- Cryptographically Secure Distributed Contact Tracing System: develop a privacy-preserving mobile application for contact tracing, leveraging advanced cryptographic techniques to securely track disease spread without compromising individual privacy.
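As promised in the web-cache item above, consistent hashing is the one trick from that list you can build in an afternoon. A minimal Python sketch with virtual nodes - the class, names, and hash choice here are mine, purely illustrative:

```python
import bisect
import hashlib

class HashRing:
    """Consistent hashing with virtual nodes (a sketch, not production code)."""

    def __init__(self, nodes: list[str], vnodes: int = 100):
        # Each physical node gets many points on the ring for even spread.
        self._ring: list[tuple[int, str]] = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # First vnode clockwise from the key's hash; wrap around at the end.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("/index.html"))
```

The payoff: adding or removing a node remaps only about 1/N of the keys, instead of reshuffling everything the way naive `hash(key) % N` does.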
Learn to Code with AI
I'm also getting cozy with our new AI overlords. Not in a "please don't replace me" way, but in a "let's tango" kind of way. I'm learning how to whisper sweet nothings into GPT's ear to make it do my bidding. Prompt engineering isn't just for the ML crowd anymore, my friends. It's a survival skill.
Imma Niche
And let's talk about specialization. Remember when being a "full-stack developer" was all the rage? Well, now it's time to pick your niche and dig in like your career depends on it (because it does). Maybe it's security, or high-performance computing, or wrangling the ancient COBOL systems that run our financial infrastructure. Find something that AI can't easily replicate and become the go-to human for it.
I want to niche myself at the intersection of infra and AI. I've already covered some systems/infra projects earlier, so now I'll give you some of my project ideas concerning AI.
Let's code our own GPT. Yeah, that's right. I want you to create a baby version of those fancy language models everyone's yakking about. Start small, maybe just predicting the next character in a string. Then work your way up to finishing sentences, paragraphs, and eventually, writing your own stand-up comedy routine. Bonus points if it's actually funny.
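Here's that "start small" step made concrete: a character-level bigram model, which is about the smallest thing you can honestly call a language model. Train it by counting which character follows which, then sample:

```python
import random
from collections import Counter, defaultdict

def train_bigram(text: str) -> dict[str, Counter]:
    """Count which character tends to follow which - the tiniest possible LM."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def generate(counts: dict[str, Counter], seed: str, length: int = 50) -> str:
    out = seed
    for _ in range(length):
        nxt = counts.get(out[-1])
        if not nxt:
            break  # dead end: we never saw this character in training
        chars, weights = zip(*nxt.items())
        out += random.choices(chars, weights=weights)[0]
    return out


corpus = "the quick brown fox jumps over the lazy dog. " * 100
model = train_bigram(corpus)
print(generate(model, "th"))
```

From here, the road to baby GPT is swapping counted frequencies for a trained neural net, and single characters for longer contexts and tokens.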
Or how about a snarky code reviewer? Create an AI that reviews code like that one colleague we all have. You know, the one who makes you question your life choices with every pull request. Make it sassy, make it smart, and of course, make it catch those off-by-one errors.
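A sketch of how you might wire that up, assuming the official openai Python package and an API key in your environment - the model name and prompt here are placeholders, so swap in whatever is current:

```python
from openai import OpenAI  # assumes `pip install openai` and OPENAI_API_KEY set

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a brutally sarcastic senior engineer reviewing a pull request. "
    "Be funny, but always flag real problems: off-by-one errors, bad names, "
    "missing error handling. End with one genuinely constructive suggestion."
)

def review(diff: str) -> str:
    # One chat completion per diff; a real tool would chunk large PRs.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder: use whatever model is current
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Review this diff:\n\n{diff}"},
        ],
    )
    return response.choices[0].message.content


print(review("for i in range(len(items) + 1): process(items[i])"))
```

Hook it up to your CI and enjoy getting roasted by a machine. Character building.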
Here's a few more:
- Slack-based Data Visualization Recommender: develop a tool that integrates with Slack to analyze conversations and suggest appropriate visualizations. For example, when team members discuss sales figures for a specific region, the system would propose various visual representations of the sales data.
- Speech-to-Structured Query Converter: create a system that enables structured querying of spoken text data. This tool would process recorded call center conversations, allowing analysts to pose questions like "What's the frequency of inquiries about topic X?" or "How many callers are from California?" The system would use an analyst-defined ER schema to automatically categorize and structure the audio data.
- Web-based Schema Population Tool: design a tool that, given a user-defined ER schema, automatically searches the web for relevant data to populate it. For example, if a user creates a schema with [temperature, state, date], the system would utilize Google's data search API to find and populate the schema with appropriate information.
- LLM-driven Data Science Task Optimizer: leverage large language models like ChatGPT to automate data science tasks efficiently. Focus on minimizing the number of LLM invocations or implementing a cascade of models ranging from local, cost-effective options to more sophisticated, resource-intensive ones.
- Vector Database Performance Benchmark: investigate the trade-offs between various vector storage and search methodologies. Evaluate the scalability of RAG systems like Pinecone, determine the limitations of exhaustive vector search, and compare different vector stores in terms of performance and accuracy. Consider developing a VectorStore benchmark, referencing the integrations listed in the LlamaIndex documentation.
- Efficient Raw Data Parsing Framework: explore innovative techniques for parsing CSV, JSON, and text data into databases more efficiently. Draw inspiration from papers like "Filter Before You Parse" and "Mison: A Fast JSON Parser for Data Analytics" to develop novel parsing strategies.
- Cross-platform Analytics Integration System: create a tool that streamlines analytics across various data storage formats and platforms, including structured CSV data in blob storage (e.g., S3 Parquet format), data warehouses (e.g., Redshift), and relational databases (e.g., PostgreSQL).
- Machine Learning-based Wikipedia Data Extractor: develop a supervised learning-based system that can extract structured data from Wikipedia articles, focusing on inferring the content of infoboxes from the surrounding text.
- Computer Vision-based Map Metadata Extractor: build a system that uses computer vision techniques to extract metadata from imagery, such as identifying speed limits from Google Street View images, to enhance digital maps like OpenStreetMap. Utilize existing tools like Google Cloud Vision API for text extraction and object identification.
- Approximate Join Algorithm for Unmodified Databases: research and implement techniques for approximately executing database joins without modifying the underlying database system. Use the WanderJoin algorithm (as described in the SIGMOD '16 paper) as a starting point for this investigation.
- Automated Time Series Forecasting Framework: create a comprehensive system to simplify the time series forecasting process, encompassing data integration, automatic parameter tuning, and user interface design. Consider using Facebook's Prophet library or similar tools as a foundation for this project.
Learn How to Rizz 🕺
And lastly, the real secret sauce - and I can't believe I'm sharing this, but we're all friends here, right? Learn how to talk. Rizz, baby. Be a rizzler. I know, I know, it's scary. We'd all rather nest our for-loops than go to a meeting with actual people. But trust me, the ability to translate geek-speak into something the suits upstairs can understand? That's gold, baby. Pure gold.
So fire up those IDEs, crack open a fresh energy drink, and let's make some magic happen. Learn the deep stuff, dance with AI, find your niche, and for the love of all that is holy, learn to communicate with non-programmers.
The Final Commit
Look, I'm not trying to be all doom and gloom here (although I admitted I'm FUDing because I'm a lame writer), but I find this is exactly the kind of perspective that gives you a kick in the nuts and makes you stand up and fight back. The future of software engineering is likely going to be awesome in ways we can't even imagine. But if you're like me, still learning the ropes, then we need to act fast.
This is our moment, our last dance in the era where "I know how to code" is enough to open doors. So let's make it count. Learn, build, connect, and position ourselves to ride this wave of change instead of being swept away by it.
Who knows? Maybe in a few years, we'll be the grizzled veterans, regaling newbies with tales of the wild west days of software engineering. "Back in my day, we had to write our own for-loops, uphill both ways!"
So, fellow code monkeys, shall we dance?