A Computer Upgrade Shut Down Bart
Posted 4 months ago · Active 4 months ago
Source: bart.gov · Tech story · High profile
Tone: heated, negative · Debate score: 80/100
Key topics: Public Transportation, Infrastructure, Software Upgrade
A computer upgrade caused BART to shut down, sparking frustration and snarky comments about the reliability of public transportation infrastructure in the Bay Area.
Snapshot generated from the HN discussion
Key moments
- Story posted: Sep 5, 2025 at 10:52 AM EDT
- First comment: Sep 5, 2025 at 11:32 AM EDT (40m after posting)
- Peak activity: 144 comments in 0-12h, the hottest window of the conversation
- Latest activity: Sep 10, 2025 at 10:23 PM EDT
ID: 45139270 · Type: story · Last synced: 11/20/2025, 8:18:36 PM
Also, what do you mean by trains being local-first? Trains by definition need to share the same tracks with catastrophic consequences for getting it wrong. You can't figure out if a train is going to possibly be on the same route locally, or if your route has been obstructed. Somebody gets a schoolbus stuck on a crossing, it takes over a mile to stop a train.
In the days before systems existed for publishing such schedules and emergency alerts, should public transit service not have been attempted at all?
> Trains by definition need to share the same tracks with catastrophic consequences for getting it wrong.
Just because it uses the same rail gauge as intercity freight doesn't require it to run on the same set of tracks. But if it did, I assume "local-first" entails other traffic just being excluded when an emergency in the local system necessitates it.
Edit, for the pedantic: There's a huge difference between horizontal complexity (i.e. variety of transit options) and vertical complexity (complexity of a particular option). We have less horizontal complexity than we used to; but vertical complexity of a modern railroad is obscene compared to historical standards.
> But if it did, I assume "local-first" entails other traffic just being excluded when an emergency in the local system necessitates it.
No dice; consider what happened just 14 hours ago:
https://x.com/SFBARTalert/status/1963772853947355630?ref_src...
How does a local-first train safely operate if it could go through a police zone? You need communication, by definition, not local-first.
I think our over-reliance on the telecom network has led to safety issues, mostly in terms of "what to do when the telecom goes down." Because on the whole, it's astoundingly reliable.
Wikipedia has a good survey [0].
[0] https://en.wikipedia.org/wiki/Railway_signalling
Of course, centralized signaling is better: it allows for greater efficiency, helps dispatch keep better track of the trains, and makes handling malfunctioning signals a lot safer, among many other benefits. But that doesn't mean local signaling can't be done.
I don’t know, but I would imagine, there’s still a block based setup as a failsafe backup in most or all modern rail systems.
https://www.nytimes.com/interactive/2025/04/20/nyregion/nyc-...
For me, this was the best picture:
https://static01.nytimes.com/newsgraphics/2025-03-10-subway-...
Someone has to stand at that machine 24 hours a day and push and pull levers to keep the trains from whacking one another.
BART has a non-standard rail gauge size that precludes it from interoperability with other rail networks.
https://www.bart.gov/news/articles/2022/news20220708-2
Other ones I'm aware of are Washington DC's metro, and Toronto's subway and streetcars.
We're talking about BART, which uses a track gauge of 5'6" instead of the standard US rail gauge of 4'8.5". They can't run on the same tracks.
(Actually, this is generally true even for those systems that do use 4'8.5" gauge track--I suspect that the standard US freight car envelope doesn't actually fit on most subway systems.)
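For a sense of scale, the two gauges quoted above can be converted to millimetres. This is only a unit-conversion sketch; the helper name `feet_inches_to_mm` is made up for illustration, and the only inputs are the 5'6" and 4'8.5" figures from the comment.

```python
MM_PER_INCH = 25.4  # exact by definition of the international inch


def feet_inches_to_mm(feet, inches):
    """Convert a feet-and-inches measurement to millimetres."""
    return (feet * 12 + inches) * MM_PER_INCH


# Figures quoted in the thread, not independently measured here.
bart_gauge_mm = feet_inches_to_mm(5, 6)        # BART: 5'6"
standard_gauge_mm = feet_inches_to_mm(4, 8.5)  # US standard: 4'8.5"
gauge_difference_mm = bart_gauge_mm - standard_gauge_mm

print(f"BART gauge:     {bart_gauge_mm:.1f} mm")
print(f"Standard gauge: {standard_gauge_mm:.1f} mm")
print(f"Difference:     {gauge_difference_mm:.1f} mm")
```

The difference works out to 9.5 inches, roughly 24 cm between the rails, which is why BART rolling stock cannot simply be put on standard-gauge track or vice versa.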
As a related aside, the Chicago Transit Authority still ran freight on its tracks until not that long ago. Maybe the early 2000's?
It is certainly possible to send a freight train that will fit in most subway tunnels of the right gauge, but you may need a short locomotive and short cars.
(After all, what are the maintenance trains but a form of freight?)
The standard US freight envelope probably counts as Plate C, which is 10'8" wide by 15'6" above the rail. Plate H is the standard for double-stacked containers, which pushes the height to 20'2".
(The part of the loading gauge that I'd be most concerned about is actually the width of the cars at the bottom of the carbody--passenger cars tend to be somewhat narrower than standard boxcar, and given a desire to minimize the platform gap, I'd think there's a decent chance that most freight would strike the platform.)
That said there are other reasons a subway could end up being subject to Federal Railroad Administration[2] rules. I will note that I'm not an expert on those rules. But, generally passenger rail systems in the US are subject to Positive Train Control[3] or equivalent. It appears BART is actually one of the earliest adopters of Automatic Train Control[4], which appears to be a PTC equivalent. If not more automated.
[1] https://en.wikipedia.org/wiki/Loading_gauge
[2] https://en.wikipedia.org/wiki/Federal_Railroad_Administratio...
[3] https://en.wikipedia.org/wiki/Positive_train_control
[4] https://en.wikipedia.org/wiki/Bay_Area_Rapid_Transit#Automat...
Eh? I thought we (TTC, in Toronto) were the only ones making that mistake.
> Just because it uses the same rail gauge as intercity freight doesn't require it to run on the same set of tracks
Building a replica set of tracks that runs parallel to the current tracks just to avoid sharing doesn't strike me as a good use of anyone's time/money.
> "local-first" entails other traffic just being excluded
And how are you going to notify them that they are excluded when the network is down?
The US congressional committee that recommended construction of the railroad was called the "Select Committee on the Pacific Railroad and Telegraph".
So it seems very early it was decided that no, rail transit systems should not be built without communications/publishing infrastructure.
[0] https://en.wikipedia.org/wiki/First_transcontinental_railroa...
Modernization efforts focus on trains broadcasting position and speed so trains can travel closer together and still maintain a safe stopping distance, but that's again possible locally.
Operating switches is where it gets trickier. Some rail operators maintain the possibility to operate them locally, but that requires either stopping the train at each switch you want to change, or deploying lots of people into the field to do it on schedule.
But the point that you can do this local-first is still true. You will want to engage a couple bits of information with the neighboring block, but you don't need to know any global state, and if one block breaks down that only affects its direct neighbors
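The "only talk to your neighbor" idea can be sketched as a toy three-aspect block signal. This is an illustration under stated assumptions, not how BART or any real railway implements it: the `Block` class, the `signal_aspect` function, and the rule that the end of the line shows caution are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    """One section of track; 'occupied' would come from a local track circuit."""
    name: str
    occupied: bool = False


def signal_aspect(protected: Block, beyond: Optional[Block]) -> str:
    """Aspect of the signal guarding entry into `protected`.

    Uses only the protected block and its direct neighbor, no global state.
    """
    if protected.occupied:
        return "red"      # a train is in the block: stop
    if beyond is None or beyond.occupied:
        return "yellow"   # next block busy (or end of line): proceed prepared to stop
    return "green"        # two clear blocks ahead


# Hypothetical four-block line with a train sitting in block C.
line = [Block("A"), Block("B"), Block("C", occupied=True), Block("D")]
aspects = [
    signal_aspect(block, line[i + 1] if i + 1 < len(line) else None)
    for i, block in enumerate(line)
]
print(aspects)  # ['green', 'yellow', 'red', 'yellow']
```

If block C's equipment failed, only the signals for B and C would be affected in this sketch, which matches the comment's point that a broken block only degrades its direct neighbors.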
If air traffic control can fall back to pen and paper in a pinch, I think it would be cool IF trains had a decent fallback. ;-)
Riding without a ticket? Jail.
Littering on the platform? Straight to jail, right away!
Doing any violent crime in NK transit? Believe it or not - death by firing squad.
Here is a quick overview of how the system works: https://youtu.be/eiyfwZVAzGw?si=CnOMa8F6NkiyhifE
Setting aside safety for a moment, consider just hygiene: BART is shockingly dirty. Which suggests mismanagement, above and beyond just a lack of deterrence of criminality.
As for safety -- firing squads are probably not in the cards, but would jailing the violent be too much to hope for?
Not saying SF politics is great, but at least point to the correct boogeyman.
https://transparentcalifornia.com/salaries/search/?q=compute...
https://transparentcalifornia.com/salaries/search/?q=compute...
It seems to me that they are over-worked & under-paid and are doing a good job given the circumstances.
NIMBYs have blocked BART in Silicon Valley. BART doesn't reach Menlo Park, Palo Alto, Stanford, Mountain View, Sunnyvale, Los Altos, Santa Clara, or Cupertino. A few years ago, it finally reached San Jose.
A separate train (CalTrain) goes from SF through Silicon Valley. Last year they switched to electric trains, which are faster and run more frequently. The SF CalTrain station is inconvenient (a 20-minute walk from downtown, under a highway), but they are working to extend CalTrain to the central SF station: https://en.wikipedia.org/wiki/Salesforce_Transit_Center#Futu... .
So Silicon Valley transit is getting better, slowly.
edit: lmao, so many upvotes yet my comment has been moved so low. No more snark than a loving brother would provide. TY for your attention to this matter
Probably what's happened is you ended up with a lot of upvotes, but also a lot of downvotes. I would expect HN's software to downrank "controversial" posts, since those are likely to lead to flamewars. So even if you see +30 on your comment, the overall tallies might be something along the lines of +100 -70.
Addition of the last phrase had truly the worst impact; Drumpf meme was not well-timed/placed
"Deploy on every commit" lmao
"Shipping software and running tests should be fast. Super fast. Minutes, tops." hahah
And there are hints to what the author actually means, like "Each deploy should be owned by the developer who made the code changes."
That just isn't feasible in a system that's of any reasonable size.
Whose fault is that?
Asking because I have been the customer with Uncommon Option Q enabled.
You mean to tell me not everyone works on some SaaS product outside of critical path?
For "read only friday" to have been a novel idea in the first place, you needed a starting point where conventional practice already was making changes live without stopping to consider the time/day of week.
I really suspect the detractors represent a workflow that would break (or at least introduce pain) if unable to push to production for a few days. So they have to give the hard sell on the benefits of continuous deployment.
As an engineer I have absolutely no issue deploying on a friday. But friday bar starts at 4pm, and after that I am not sober before monday.
So leadership don't want me to do it - which is probably wise.
I've found that little things like that breed a growing resentment and stress that compounds, until someone wants to leave the company. Thursday night outage that I have to hop on? Much smaller deal than a weekend where I have established plans.
One can argue "why was the PR approved in the first place", but sometimes people make mistakes. It especially sucks when there are limited people that know how to troubleshoot and resolve the production issues with a system, even more so when the on-call individual may have not even reviewed the code initially.
All that said - I'd love to deploy as normal on Fridays! I've just found that the type of businesses I've worked at can wait until Monday, and that makes our weekends less risky.
The idea the author seems to be advocating for is that, while maybe you sometimes/often shouldn't deploy on a Friday (or even not at the very end of any workday), there should never be a stated policy in place that freezes deployments.
And yeah, I've been at places where they have freezes on weekends, holidays, right around the company's conference, etc. But they're never 100% freezes: if something goes wrong or is necessary, you just get a manager to approve it, and off you go.
I think the author's exhortation that developers should all be able to exercise their judgment to make these calls is a nice idea in theory, but falls flat in practice. Every developer will not always have all the necessary context in order to exercise that judgment. Even those who do, and generally have good judgment, will screw up sometimes because they are tired or are working under some sort of time pressure, or something.
Having a policy -- with some flexibility and exceptions allowed -- makes it easier to avoid those sorts of lapses in judgment. And that's a good thing.
But the whole article is just all over the place to me. The author starts by implying that people should be "ashamed" about identifying with a no-Friday-deploy policy, but then softens to the point of saying it's fine to have a personal policy of no late-afternoon deploys, no shipping big changes right before the weekend, etc. But that somehow if that's instead company policy, that's a bad thing. Nope, I don't buy it.
Public transit wasn't as gross as stories I've heard of elsewhere, but it also wasn't something I wanted to take on a regular basis if I could help it. I think I used it regularly for about six months or so one year in particular, and the lack of warm bus stops meant standing in freezing rain, sleet, snow and more.
Maybe things have improved since I lived there, but hearing that they are the high bar is pretty sad.
> it was still notably slower than driving, so many if not most people still drove everywhere
I'd argue that those folks are missing the point. Sure, when I was commuting by Minneapolis public transit, it was slower than driving. But you know what I wasn't doing while I was on the bus/train? Driving! I was reading, writing, daydreaming, sleeping, any number of activities more pleasant than sitting on I-94.
Standing out there in the winter could be brutal, I'll admit. Then again, the light rail stops were heated, and the park & ride I transferred at in Plymouth had a nice climate-controlled lobby. The only time I was really out there was standing in the driveway in front of my office, waiting for the shuttle to pick me up.
Twin Cities public transit is a damn sight better than what we have in Milwaukee, that's for sure. Low bar, but the Twin Cities clear it handily.
Sadly, my neighborhood had long waits between buses that connected to University Ave, and neither my neighborhood nor University Ave had heated stops. So, odds were pretty good that I'd suffer the weather for 20-30 minutes each trip.
I also tend to get motion sickness if I read or use a laptop in cars or buses, so there really wasn't anything I could do on them that I couldn't do by driving anyway.
It is a microcosm, a bit of a litmus test, and an ideological battlefield of the embattled sides. But this example specifically is also a kind of infighting, of the more anarcho-libertarian tech camp that enjoys highlighting and dripping with self-righteousness about any tech related failure of government, i.e., or at least government that does not align with their ideology or control over it.
This fault line of America runs right through things like BART like an effigy or idol that America performs a kind of ritual form of battle on as proxies for all out civil war. Think of tribes you may have seen videos of where they do all kinds of elaborate dances and blustering displays and fake charges to demonstrate their power.
The glee about this outage happening to BART is very much because the libertarian tech progressive types are amused and validated by it, where something more like rashes of violent attacks on BART riders by menaces to society might be something that the "heartland" may become gleeful about, as evidence for how the ideology of SF is messed up. In the cases of violent attacks on BART riders, another camp/tribe would come out and demonstrate their fierceness; the "socially liberal" types from all over the country and even world, would rush to the defense of their ideological idols with a bewildering storm of rationalization, delusion, and excuse making for violent attackers and in defense of their ideology/cult.
It's just elaborate war dances around an idol/ideology to demonstrate how fierce and powerful each party is. BART is just one of the idols in America around which these displays of simulated conflicts happen.
Originally, BART was a master stroke of digital integration in the '70s, and its digital voices announcing the next trains were a thing of the future: an early accessibility feature before we even knew what those were, really.
Reading:
https://www.bart.gov/about/history
https://www.bart.gov/about/projects/traincontrol#:~:text=To%...
The larger engineering lesson from that is you're probably better off making standard solutions work for your situation than building custom ones. The wider gauge solved(?) the stability problem, but at the cost of always needing custom rolling stock and, more importantly, making BART build-out significantly more expensive and unable to take advantage of existing track. That hurts the viability of the BART ecosystem.
Why not use Indian rolling stock? Modern Indian metro trains are quieter and more comfortable than BART.
(It is the widest gauge in use of heavy-duty mainline railways in the world, but that's a separate thing).
For comparison to other US rapid transit systems (almost all others use 1,435mm/4'8.5"), see table in https://en.wikipedia.org/wiki/Rapid_transit_track_gauge and scroll down to "United States". Or else https://en.wikipedia.org/wiki/List_of_United_States_rapid_tr... which does not list gauges.
[0]: https://en.wikipedia.org/wiki/Track_gauge_in_the_United_Stat...
Cars are anti-fragile and decentralized.
Cars fail open in the short term.
The real heroes? The bus drivers. The baddies? The planners, the management. The evil? Pure unadulterated evil? The AC Transit app. I would give it a -11.
It was state of the art in 1962 when it was designed, and remained state of the art until the 1980s, when the signal system started breaking down. The late-80s upgrade had a train presence glitch, which caused almost all of the system to get resignaled.
So by the 2000s again it's showing its age, and they got a 32 processor zSeries mainframe.
Brake problem last week, and then this on Friday? Now it's getting like New York, even more. Whatsmatteryou?
That and IOPS are the primary advantage of mainframe systems.
It's quite possible the system will collapse next year if we don't pass increased taxes to fund it in 2026 https://www.bart.gov/about/financials/crisis.
Just last year we failed to pass a common sense bill to make it so we only need a 51% majority for transit bills in the future, indicative of how opposed we still are to transit in the Bay Area https://www.cbsnews.com/sanfrancisco/news/california-proposi....
Not to mention the fact that Silicon Valley opted out of BART and chose car dependent sprawl instead.
So let's be clear, most of the issues with BART are due to anti-transit and suburban voters starving it of support.
Just to compare with another expensive city - BART serves 1/20th of London's Tube rides while operating on 1/5th of the Tube's budget.
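Taking the two fractions in that comment at face value (they are the commenter's figures and are not verified here), the implied spending per ride relative to the Tube is simple arithmetic. The variable names below are made up for illustration.

```python
from fractions import Fraction

# Commenter's claims, relative to the London Tube (Tube = 1 for both).
bart_ride_share = Fraction(1, 20)    # BART carries 1/20th of the Tube's rides
bart_budget_share = Fraction(1, 5)   # ...on 1/5th of the Tube's budget

# Budget per ride, expressed as a multiple of the Tube's budget per ride.
relative_cost_per_ride = bart_budget_share / bart_ride_share

print(f"Implied budget per ride: {relative_cost_per_ride}x the Tube's")
```

So the claim amounts to BART spending about four times as much per ride as the Tube. The ratio says nothing about trip length, fares, or capital versus operating costs, so it is only as good as the two input fractions.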
Costs are an America issue, not a BART issue: https://transitcosts.com/new-data/
BART is one of the most cost efficient systems in the US: https://www.reddit.com/r/transit/comments/1d27dvo/us_cost_pe.... It's so efficient that pre-pandemic it got the majority of its funding through fares, not taxes.
By the way, it costs exorbitant amounts to build highways too, and you don't see people criticizing all of our highways around the area, do you?
So quite frankly you don't know what you're talking about.
> Costs are an America issue, not a BART issue: https://transitcosts.com/new-data
If by "America" you mean NYC/SFBA then sure. You can see in your own link there's massive spread across the locales with some being cheaper than UK per km
> you don't see people criticizing all of our highways around the area do you
uhm what?
> If by "America" you mean NYC/SFBA then sure. You can see in your own link there's massive spread across the locales with some being cheaper than UK per km
What you're talking about in that link is the extension to San Jose, not day to day BART operations. That one does deserve criticism as they've made poor decisions like not doing cut/cover because NIMBYs in San Jose don't want any disruption to streets. So instead we are tunneling to the Earth first. Elsewhere in the world municipalities understand that it's worth temporary disruptions to roads to bring down costs, but of course America is unique and we have to learn these lessons ourselves.
It seems to me that BART management did what most other government bureaucracies did around here during covid: threw their feet on the desk and took an extended 2+ year sabbatical.
As a former tube-commuter and occasional BART-user, I'd wager that possibly a majority of the commuting trips in zone 1 are taking people from a mainline train station to somewhere, and then back in the evening. That option barely even exists in the Bay Area - indeed every time I look at how to use Caltrain from SFO I give up and rent a car instead.
Why? Last I checked, it's
Is there some complication I'm missing (other than the fact that neither BART nor Caltrain are 24/7 services)?
Fortunately they've since reverted back to always running to Millbrae from the airport.
Once you actually get to Millbrae you then get to deal with BART's whole NIH problem manifesting as a refusal (up until recently) to offer timed connections with Caltrain. And, of course, up until 2021 actually getting between the BART and Caltrain platforms involved a ton of walking.
I would think increased ridership means increased efficiency.
The problem of course then is that you create a hole in the bucket. Fewer trains -> BART becomes less convenient -> people choose other options -> lower ridership -> fewer trains -> less convenient ....
Jokes aside, I'd like to see a stack ranking of US public transit. I'd assume NYC and DC are top dogs, but I'm curious about other cities.
NYC is definitely the top dog. There was a recent ranking for metro areas ranked by walkability, bike-ability, public transit, and some other urban score, but divided by average rent price for a 1BR apartment. NYC still came out #1 despite the rather large denominator.
It even has direct service from two metro lines to the airport.
Didn't Big Tech start buses going directly to their campuses as a perk?
Everyone wants more services and lower taxes, but they vote for the lower taxes and get mad when there are no services. Those things often don't go together. It's okay to either accept fewer services with less tax burden, or higher taxes with more services (the side I generally lean towards, within reason).
203 more comments available on Hacker News