As many of you noticed, AnandTech has spent several hours offline today. We are still in recovery mode at the moment (as I write this, the site has been restored to a copy from November 25th), but now that our major restoration efforts are completed, I wanted to offer you guys a brief update on the status of AnandTech.

At around 13:00 UTC (5am PT) today, the on-site cloud storage for AnandTech’s hosting provider became corrupted. As a result, AnandTech (and some other sites) were brought offline. Due to the nature of the corruption and the need to begin restoration efforts ASAP, we opted to restore the site from an off-site cold storage backup, rather than trusting the questionable on-site storage.

This is the first time we’ve ever had to execute our off-site data recovery plan. And while it meant AT took a bit longer to restore than would be ideal, ultimately everything worked out and proved the necessity of off-site backups.

We’re still working to restore content from the last few days. Articles will be back, but we’ve likely lost any comments and user account registrations/updates made since midday Friday. Sorry about that! And thank you for bearing with us during today's outage.

31 Comments

  • iq100 - Friday, December 2, 2022 - link

    Geoffrey A wrote:
    "A design that has no single point of failure.
    Or state there is no such design."

    I think it's fair to say the latter wins the prize.
    ---
    Tandem Computers, long ago, had such a design. Not even expensive with today's inexpensive servers.

    reference: https://en.wikipedia.org/wiki/Tandem_Computers
    "Tandem's NonStop systems use a number of independent identical processors and redundant storage devices and controllers to provide automatic high-speed "failover" in the case of a hardware or software failure. To contain the scope of failures and of corrupted data, these multi-computer systems have no shared central components, not even main memory. Conventional multi-computer systems all use shared memories and work directly on shared data objects. Instead, NonStop processors cooperate by exchanging messages across a reliable fabric, and software takes periodic snapshots for possible rollback of program memory state."

    What do AnandTech and everyone else think? Is it possible? Write up the design here.
  • GeoffreyA - Saturday, December 3, 2022 - link

    I think it's a brilliant design, this extreme redundancy in the spirit of distribution.

    (I get the feeling that even the universe's "data structures" keep track of things in a distributed fashion. When reading current theories, one gets the impression that nothing is global, but the consistent state is built up piece by piece. Perhaps the key is message transfer, rather than storage in some "big table!")
  • The Von Matrices - Friday, December 2, 2022 - link

    If the only data loss is a few of the most recent comments, I would call that a success.

    There is always a tradeoff between what you are willing to lose and how much you are willing to pay to avoid loss. You certainly can design a system that cannot suffer a data loss event, especially on a news site where there isn't much data being generated, but whether the company can afford such a system is another issue. News, especially online news, is an extremely low-profit business.
  • Dug - Tuesday, December 6, 2022 - link

    Why? Are you paying their salaries? It's just a website that went down and came back up; it's not a big deal. The "hardware/software design that cannot suffer a data loss. A design that has no single point of failure" does not exist.
  • The Von Matrices - Friday, December 2, 2022 - link

    When I saw the number of comments on the "Best CPUs" article decrease, I thought it was due to the release of the long-awaited comment editor. Alas, it was only data corruption. We can only dream...
  • Threska - Saturday, December 3, 2022 - link

    Well that "sponsored post" article lost a lot of comments.
  • ballsystemlord - Sunday, December 4, 2022 - link

    Hopefully only the shill posts (ha ha).
  • fervloka - Saturday, December 3, 2022 - link

    All I want to know is why AnandTech is apparently unaware that AMD launched Genoa.
  • DigitalFreak - Saturday, December 3, 2022 - link

    With the frequency articles are posted, was anything really lost?
  • supdawgwtfd - Sunday, December 4, 2022 - link

    +500
