So I built something over the course of a few hours. It's called Tangled Sync, and whilst it doesn't seem to function properly yet, I'm trying my best. This is one of those posts where I'm documenting both what I've made and where I'm stuck, because sometimes writing about a problem helps clarify it—and sometimes you just need to rant into the void about why simple things are never actually simple.
What Even Is This?
Tangled Sync is meant to automate syncing GitHub repositories to Tangled whilst publishing ATProto records for each repository. The idea came from wanting my GitHub projects mirrored on Tangled—a decentralised alternative hosting layer—whilst maintaining structured metadata through AT Protocol.
In theory, it's straightforward: clone repos from GitHub, push them to Tangled, update READMEs with mirror links, and create ATProto records so everything stays discoverable. In practice, I've discovered that "straightforward" is doing a lot of heavy lifting in that sentence, and I'm increasingly convinced that the universe has a personal vendetta against anyone who thinks automation should be easy.
The Setup (Which Should Be The Easy Part)
The configuration lives in src/.env, where you specify:
BASE_DIR – where GitHub repos get cloned locally
GITHUB_USER – your GitHub username or organisation
ATPROTO_DID – your ATProto DID
BLUESKY_PDS – PDS instance URL
BLUESKY_USERNAME and BLUESKY_PASSWORD – authentication credentials
Standard stuff, really. Nothing particularly clever here, just the baseline configuration you'd expect for something that needs to talk to multiple services. This part actually works, which is honestly shocking given how the rest of this has gone.
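As an illustration of failing early on a bad configuration, here is a minimal sketch of a validation step. The key names come from the list above; `loadConfig`, `REQUIRED_KEYS` and `SyncConfig` are hypothetical names, not taken from the Tangled Sync codebase.

```typescript
// Hypothetical helper: the key names mirror the src/.env settings listed above.
const REQUIRED_KEYS = [
  "BASE_DIR",
  "GITHUB_USER",
  "ATPROTO_DID",
  "BLUESKY_PDS",
  "BLUESKY_USERNAME",
  "BLUESKY_PASSWORD",
] as const;

type ConfigKey = (typeof REQUIRED_KEYS)[number];
type SyncConfig = Record<ConfigKey, string>;

// Takes a plain key/value object (for example process.env after dotenv has run)
// and throws one error naming every missing or empty setting.
function loadConfig(env: Record<string, string | undefined>): SyncConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required settings in src/.env: ${missing.join(", ")}`);
  }
  const config = {} as SyncConfig;
  for (const key of REQUIRED_KEYS) {
    config[key] = env[key] as string;
  }
  return config;
}
```

Calling `loadConfig(process.env)` once at startup would report every missing key in a single message, rather than failing halfway through a sync.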
What It's Supposed To Do (In A Perfect World Where Everything Works)
When you run npm run sync, the script should:
Authenticate with Bluesky using your credentials (spoiler: this works)
Clone all repositories from your GitHub account locally, excluding the repo that matches your username to avoid recursion (also works, surprisingly)
Add a tangled remote to each repository if it doesn't exist (here's where things start getting interesting)
Push the main branch to Tangled (oh, you sweet summer child, thinking this would just work)
Update each README with a link to the Tangled mirror (theoretically fine, practically has edge cases)
Create ATProto records under sh.tangled.repo with metadata about each repository (works but feels fragile)
The ATProto records use TIDs (time-based, sortable identifiers) to ensure uniqueness, and the whole thing is meant to be idempotent—running it multiple times shouldn't break anything. Emphasis on "shouldn't," because software development is the art of discovering all the ways your assumptions are wrong.
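To make the idempotency goal concrete, here is a rough sketch of the "add a tangled remote if it doesn't exist" step written so that repeated runs converge on the same state. `ensureRemote` and `GitRunner` are hypothetical names; the git subcommands (`remote get-url`, `remote add`, `remote set-url`) are standard, but whether Tangled Sync is structured this way is an assumption.

```typescript
import { execFileSync } from "node:child_process";

// Runs a git subcommand inside repoDir and returns trimmed stdout.
// Throws when git exits with a non-zero status.
type GitRunner = (repoDir: string, args: string[]) => string;

const realGit: GitRunner = (repoDir, args) =>
  execFileSync("git", args, {
    cwd: repoDir,
    encoding: "utf8",
    stdio: ["ignore", "pipe", "pipe"],
  }).trim();

type RemoteOutcome = "added" | "updated" | "unchanged";

// Makes sure the remote `name` points at `url`, whatever state the repo started in.
function ensureRemote(
  repoDir: string,
  name: string,
  url: string,
  git: GitRunner = realGit,
): RemoteOutcome {
  let current: string | null;
  try {
    current = git(repoDir, ["remote", "get-url", name]);
  } catch {
    current = null; // git exits non-zero when the remote does not exist
  }
  if (current === null) {
    git(repoDir, ["remote", "add", name, url]);
    return "added";
  }
  if (current !== url) {
    git(repoDir, ["remote", "set-url", name, url]);
    return "updated";
  }
  return "unchanged";
}
```

Returning an outcome instead of logging inside the function makes it possible to print a per-repository summary at the end of a run, and passing the runner as a parameter keeps the logic testable without touching a real repository.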
Where It's Going Wrong (Oh Boy, Where Do I Start)
Here's where I'm stuck, and why I'm writing this rather than celebrating a working tool.
SSH Authentication Headaches (Or: Why Can't Things Just Work?)
The Tangled remote creation requires SSH keys. Fair enough. But the script doesn't handle cases where:
SSH keys aren't properly configured (because of course they aren't)
The Tangled remote doesn't exist yet (it tries to create one, but that fails if you don't have permissions)
Network issues interrupt the push (because the internet is a beautiful, reliable place that never has problems)
When it fails, it logs a warning and moves on, which is better than crashing entirely, but it means repositories can end up in inconsistent states. Some have Tangled remotes that don't work, others are half-pushed, and debugging which is which requires manually checking each one. You know what's fun? Manually checking thirty repositories to figure out which ones are broken. That's a great use of time. Really.
The error messages aren't particularly helpful either. "Could not push to Tangled" could mean anything from "your SSH key is wrong" to "the repository doesn't exist" to "Mercury is in retrograde and git has decided today's a good day to be difficult." It's like trying to debug something whilst blindfolded and being told "something's wrong, good luck."
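One way to get past a generic "Could not push to Tangled" is to look at what git and ssh actually printed. The sketch below sorts stderr text into a few buckets. The matched phrases are common OpenSSH and git messages; the exact wording Tangled's servers return is not something verified here, so treat the patterns as assumptions to adjust. `classifyPushError` and `PushFailure` are hypothetical names.

```typescript
type PushFailure = {
  kind: "auth" | "network" | "missing-repo" | "unknown";
  hint: string;
};

// Maps the stderr of a failed `git push` to a rough cause and a next step.
function classifyPushError(stderr: string): PushFailure {
  const text = stderr.toLowerCase();
  if (text.includes("permission denied (publickey)")) {
    return { kind: "auth", hint: "SSH key rejected: check that your public key is registered with the remote." };
  }
  if (
    text.includes("could not resolve hostname") ||
    text.includes("connection timed out") ||
    text.includes("connection refused")
  ) {
    return { kind: "network", hint: "Could not reach the remote host: check connectivity and retry." };
  }
  if (text.includes("repository not found") || text.includes("does not appear to be a git repository")) {
    return { kind: "missing-repo", hint: "The remote repository does not seem to exist yet: create it first." };
  }
  // Fall back to the first non-empty line so the raw message is at least visible.
  const firstLine = stderr.split("\n").find((line) => line.trim() !== "") ?? "no output";
  return { kind: "unknown", hint: `Unrecognised failure: ${firstLine.trim()}` };
}
```

With `execSync`, the thrown error normally carries the captured output in its `stderr` property, so the catch block could pass `String(err.stderr)` into a function like this and log the hint next to the repository name.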
The README Update Dance (A Comedy of Edge Cases)
The README update logic checks for tangled.org in the file content before appending the mirror link. Sounds sensible, except:
It doesn't handle READMEs with unusual casing (readme.txt, README.MD, etc.) consistently, because apparently standardising on README.md was too much to ask for
If the script fails after updating the README but before pushing to Tangled, you've got a dead link sitting there mocking you
Multiple runs can theoretically append multiple mirror links if something goes wrong (though I've tried to prevent this, emphasis on tried)
And here's the kicker: if you have a repository without a README (which, let's be honest, we've all done at some point), the script just... moves on. No error, no warning, just a quiet acknowledgment that this particular repository will never have its Tangled mirror documented. Is that ideal? No. Do I have the energy to fix it right now? Also no.
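A small sketch of how the casing and duplicate-link problems could be handled, assuming the README is picked from a directory listing and the mirror link follows the format used elsewhere in this post. `findReadme` and `withMirrorLink` are hypothetical names. The duplicate check here looks for the exact mirror URL rather than any mention of tangled.org, which is a deliberate change from the current behaviour.

```typescript
// Picks a README from a directory listing regardless of casing or extension.
// Markdown files win over other extensions when several candidates exist.
function findReadme(fileNames: string[]): string | undefined {
  const candidates = fileNames.filter((name) => /^readme(\.[a-z0-9]+)?$/i.test(name));
  const markdown = candidates.find((name) => /\.(md|markdown)$/i.test(name));
  return markdown ?? candidates[0];
}

// Returns the README content with the mirror line appended exactly once.
function withMirrorLink(content: string, did: string, repoName: string): string {
  const link = `https://tangled.org/${did}/${repoName}`;
  if (content.includes(link)) {
    return content; // already present, so repeated runs change nothing
  }
  const body = content.replace(/\s+$/, "");
  const prefix = body === "" ? "" : `${body}\n\n`;
  return `${prefix}Mirrored on Tangled: ${link}\n`;
}
```

When `findReadme` returns `undefined`, the caller can log a warning (or create a minimal README) instead of silently moving on. Ordering the steps so the README is only rewritten after a successful push would also avoid the dead-link case.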
ATProto Record Creation Confusion (Or: When "Unique Identifiers" Feel Less Than Unique)
The record creation checks for existing records by iterating through all entries in the sh.tangled.repo collection and comparing names. This works, but:
It's inefficient if you have hundreds of repos (which I don't yet, but what if I did?)
The TID generation uses a random clock ID, which feels fragile in that "it'll probably be fine until it isn't" way
Error handling for duplicate records isn't as robust as it should be (sensing a theme here?)
I've also discovered that the ATProto SDK can be... particular about how you structure requests. Not in a "here's a clear error message telling you what's wrong" way, but in a "something failed, figure it out yourself" way. It's like the SDK is playing hard to get, except instead of being charming it's just frustrating.
The worst part is that I'm not entirely confident I've got all the edge cases covered. What happens if two repositories have the same name? What if the ATProto service is temporarily down? What if I sneeze whilst the script is running and somehow that causes a cascade failure? (Probably not that last one, but at this point I wouldn't be surprised.)
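For the lookup cost and the "same name twice" worry, one option is to build an index of existing records once per run and plan the work from that. This is only a sketch: `RepoRecord` is an assumed shape (a record URI plus a value with a `name` field, since the script compares names), and `indexByName` and `planRecords` are hypothetical helpers.

```typescript
// Assumed shape of an entry already fetched from the sh.tangled.repo collection.
interface RepoRecord {
  uri: string;
  value: { name?: string };
}

// Builds a name -> record URIs index in a single pass over the fetched records.
function indexByName(records: RepoRecord[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const record of records) {
    const name = record.value.name;
    if (!name) continue; // skip malformed records rather than crash
    const uris = index.get(name) ?? [];
    uris.push(record.uri);
    index.set(name, uris);
  }
  return index;
}

// Decides which repos still need a record and which names already have several.
function planRecords(
  repoNames: string[],
  index: Map<string, string[]>,
): { toCreate: string[]; duplicates: string[] } {
  const toCreate: string[] = [];
  const duplicates: string[] = [];
  for (const name of repoNames) {
    const existing = index.get(name) ?? [];
    if (existing.length === 0) toCreate.push(name);
    if (existing.length > 1) duplicates.push(name);
  }
  return { toCreate, duplicates };
}
```

Each repository then costs one map lookup instead of a scan of the whole collection, and names that already have more than one record are reported explicitly instead of being discovered by accident.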
What Actually Works (Small Victories In A Sea of Chaos)
Despite the issues, some bits do function, which is honestly more than I expected at this point:
Repository Cloning: This part is solid. git clone works reliably, and the script correctly skips repositories that already exist locally. One less thing to worry about, thank god.
Idempotency (Mostly): Running the script multiple times generally doesn't create duplicates or break existing setups. The remote checks prevent adding multiple tangled remotes, and the record cache helps avoid duplicate ATProto entries. "Generally" is doing some heavy lifting in that sentence, but I'll take what I can get.
README Links: When everything works—and I emphasise when—the README updates are clean and preserve existing formatting. The link format is consistent: Mirrored on Tangled: https://tangled.org/${ATPROTO_DID}/${repoName}. It's the one part of this project that doesn't make me want to throw my computer out the window.
The Codebase (Because Apparently I Enjoy Pain)
It's TypeScript, using:
@atproto/api for interacting with Bluesky
dotenv for configuration
Node's child_process.execSync for git operations (because nothing says "robust error handling" like synchronous shell commands)
Standard filesystem operations for README manipulation
The TID generation is particularly interesting (or concerning, depending on your perspective). I'm using base32-sortable encoding to create identifiers that are both time-ordered and collision-resistant:
function generateTid(): string {
  const nowMicroseconds = BigInt(Date.now()) * 1000n;
  const clockId = generateClockId();
  const tidBigInt = (nowMicroseconds << 10n) | BigInt(clockId);
  return toBase32Sortable(tidBigInt);
}
This shifts the current timestamp left by 10 bits and combines it with a random clock ID to ensure uniqueness. In theory. Whether it holds up under load is another question entirely, and one I'm slightly terrified to answer. The TID format itself is solid (it's based on Twitter's Snowflake IDs), but my implementation might have issues I haven't discovered yet. Fun!
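The post doesn't show `generateClockId` or `toBase32Sortable`, so the versions below are guesses at what they might look like, filled in to make a self-contained sketch. It assumes the usual TID layout: a microsecond timestamp in the high bits, a 10-bit clock ID in the low bits, and 13 characters from the alphabet `234567abcdefghijklmnopqrstuvwxyz`. One thing worth noting: `Date.now()` only has millisecond resolution, so two TIDs generated in the same millisecond share a timestamp and differ only by the random clock ID. The sketch bumps the timestamp by one microsecond in that case, which is an addition and not part of the original code.

```typescript
const BASE32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz";

// Encodes the value as 13 base32-sortable characters, most significant first.
// The alphabet is in ascending ASCII order, so string order matches numeric order.
function toBase32Sortable(value: bigint): string {
  let out = "";
  let rest = value;
  for (let i = 0; i < 13; i++) {
    out = BASE32_SORTABLE[Number(rest & 31n)] + out;
    rest >>= 5n;
  }
  return out;
}

// Random 10-bit clock ID (0..1023).
function generateClockId(): number {
  return Math.floor(Math.random() * 1024);
}

let lastTimestamp = 0n;

// Returns a strictly increasing microsecond timestamp within this process.
function nextTimestampMicros(nowMs: number): bigint {
  const candidate = BigInt(nowMs) * 1000n;
  lastTimestamp = candidate > lastTimestamp ? candidate : lastTimestamp + 1n;
  return lastTimestamp;
}

// Parameters default to the real clock and a random clock ID; they are
// exposed only so the function can be exercised deterministically.
function generateTid(nowMs: number = Date.now(), clockId: number = generateClockId()): string {
  const timestamp = nextTimestampMicros(nowMs);
  return toBase32Sortable((timestamp << 10n) | BigInt(clockId));
}
```

Because the timestamp never repeats inside one process, uniqueness no longer depends on the random clock ID alone. Two separate processes writing to the same collection in the same microsecond could still collide, so the clock ID remains useful there.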
Why Share Something Half-Broken? (Good Question, Thanks For Asking)
Good question. Partly because writing about it helps me think through the problems more clearly. Partly because I'm hoping someone reading this might spot something obvious I've missed (please, if you see something, say something). Mostly because I think there's value in showing work-in-progress rather than only polished, finished projects.
But also? I'm frustrated. I'm frustrated that something that should be straightforward—sync repositories, create records, update documentation—turns into this maze of edge cases and failure modes. I'm frustrated that error messages are universally terrible. I'm frustrated that every tool I'm using has its own quirks and assumptions that don't quite align with what I'm trying to do.
The AT Protocol ecosystem needs more tools like this—automations that make decentralised workflows easier. Even if Tangled Sync doesn't work perfectly yet (understatement of the century), the concept is sound. GitHub as the source of truth, Tangled as the decentralised mirror, ATProto as the metadata layer. That's a reasonable architecture for maintaining code across multiple platforms.
So why is it so hard to make it actually work?
What I'm Trying Next (Because I Apparently Can't Leave Well Enough Alone)
A few things to tackle, assuming I don't give up and go live in the woods:
Better Error Handling: Rather than just logging warnings when Tangled pushes fail, I need to actually handle recovery. Maybe retry logic, maybe prompts to fix SSH config, maybe just failing more gracefully with actionable error messages that don't make me want to scream.
SSH Verification Step: Adding a preliminary check that verifies SSH connectivity to Tangled before attempting any operations. If it fails, stop immediately rather than limping through half the process and leaving everything in an inconsistent state. Revolutionary concept, I know.
Record Deduplication: Improving the ATProto record checking to be more efficient and reliable. Possibly switching to a different identifier strategy that doesn't rely on random clock IDs and the hope that probability will save me from collisions.
Testing With Edge Cases: Running this against repositories with unusual names, no READMEs, broken git configs, etc. Finding where it breaks (everywhere) and fixing those failure modes. This is going to be fun. By which I mean absolutely miserable.
Actually Writing Documentation: Because right now, anyone else trying to use this would have to read through the code and hope for the best. That's not sustainable, even if reading uncommented TypeScript builds character or whatever.
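As a sketch of the SSH verification idea, the check below probes a remote once before any repository is touched. It uses `git ls-remote`, which exits with status 0 when the remote can be reached and authenticated against. `preflight`, `Probe` and `gitLsRemote` are hypothetical names, and which URL to probe (some repository you already know exists on Tangled) is left as an assumption.

```typescript
import { spawnSync } from "node:child_process";

// A probe runs some connectivity check and reports the exit status and stderr.
type Probe = (remoteUrl: string) => { status: number | null; stderr: string };

// Default probe: `git ls-remote <url>`, with a timeout so a hung
// connection cannot stall the whole run.
const gitLsRemote: Probe = (remoteUrl) => {
  const result = spawnSync("git", ["ls-remote", remoteUrl], {
    encoding: "utf8",
    timeout: 15000,
  });
  return { status: result.status, stderr: result.stderr ?? "" };
};

type Preflight = { ok: true } | { ok: false; reason: string };

// Returns a result instead of throwing so the caller decides how to stop.
function preflight(remoteUrl: string, probe: Probe = gitLsRemote): Preflight {
  const { status, stderr } = probe(remoteUrl);
  if (status === 0) {
    return { ok: true };
  }
  const firstLine = stderr.split("\n").find((line) => line.trim() !== "") ?? "no output from git";
  return { ok: false, reason: `Cannot reach ${remoteUrl}: ${firstLine.trim()}` };
}
```

Running this once at startup and exiting when `ok` is false keeps a broken SSH setup from leaving thirty repositories in thirty slightly different states.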
The Broader Picture (Or: Why I'm Bothering With This At All)
What interests me about this project isn't just syncing repositories; it's getting centralised and decentralised systems to work together. GitHub is centralised but ubiquitous. Tangled provides decentralised hosting. AT Protocol adds discoverability and metadata.
Getting these three systems to cooperate smoothly feels like a microcosm of the broader challenges in building decentralised infrastructure. Each piece works fine individually, but the connections between them are where complexity (and failure modes) accumulate. It's like trying to get three people who speak different languages to plan a dinner party—technically possible, but requires so much translation and coordination that you wonder if it's worth the effort.
And yet, I keep coming back to it. Because the vision of truly decentralised, interoperable systems is compelling enough to push through the frustration. When it works—those rare, beautiful moments when everything clicks into place—it feels like glimpsing the future of how the web could work. Then something breaks again and I remember why we still have centralised platforms.
Where You Come In (Please, I Need Help)
If you've made it this far and have thoughts, I'd genuinely appreciate them. The repository is up at tangled.org/did:plc:ofrbh253gwicbkc5nktqepol/tangled-sync, and whilst I haven't written much documentation yet (because, well, it doesn't fully work), the code is readable enough to figure out what's happening.
Particular areas where I could use insight:
SSH remote creation strategies that don't require pre-configured keys (is this even possible?)
Better patterns for idempotent git operations (because apparently I don't understand git as well as I thought)
ATProto record management that scales better (and doesn't feel like it's held together with string and hope)
Error recovery approaches that aren't just "log and continue" (revolutionary, I know)
Seriously, if you've solved any of these problems before, please let me know. I'll be eternally grateful. I'll write you a thank-you note. I'll name my firstborn after you. Whatever it takes.
Reflections (Or: What Have I Learned From This Mess?)
Building Tangled Sync has been one of those experiences where the gap between "this should be straightforward" and "why isn't this working" becomes painfully apparent. The individual components aren't complicated, but orchestrating them reliably is harder than I expected. Much harder. Like, exponentially harder.
I'm also learning (again, because apparently I need to learn this lesson multiple times) that automation is only as robust as your error handling. The happy path works fine. It's all the edge cases—network hiccups, missing permissions, unexpected file structures, cosmic rays flipping bits in memory—that turn a "few hours of work" into an ongoing debugging saga.
Still, there's something satisfying about building tools that connect different platforms, even when they're temperamental. The vision of having my GitHub projects automatically mirrored on Tangled with proper ATProto metadata is compelling enough to keep tinkering. Even when it makes me want to give up and become a shepherd or something equally unrelated to software development.
Next Steps (Or: The Sisyphean Task Continues)
For now, I'm going to keep chipping away at the error handling and SSH authentication issues. Maybe add some better logging so I can actually see what's failing and where. Possibly restructure the ATProto record creation to be less... optimistic about things working on the first try.
And if I can't get it working reliably? Well, I'll document what I learned and why it didn't pan out. Sometimes the most valuable projects are the ones that teach you what not to do next time. Though honestly, I'd prefer a project that just worked.
The frustrating thing is that I can see how this should work. The pieces are all there. They just don't want to fit together properly, like trying to assemble IKEA furniture with instructions written in a language you don't speak. You know it's supposed to be possible, you can see other people have done it, but you're left sitting on the floor surrounded by parts wondering where you went wrong.
If you're interested in following along with this mess of a project, check out the repository at tangled-sync. Fair warning: it's experimental, occasionally breaks, and definitely doesn't have proper documentation yet. But hey, that's half the fun of working in the decentralised space, isn't it? Building things that probably shouldn't work, and being pleasantly surprised when they occasionally do.
And if they don't work? Well, at least you've got a blog post to show for it.