Anyone else look at the headline and feel this is one of the dumbest headlines ever? It makes it sound like NASA's incompetent. 'Why haven't you fixed Hubble yet?'.
There doesn't seem to be any nuance or respect that they're trying to repair an orbiting telescope that was launched 30 years ago and designed 40 years ago - and that people are patiently trying to sort through a fully autonomous system 400 miles above the surface of the Earth with a very large set of failure options.
For me - huge props to NASA and other organizations that do this kind of work and keep these systems running for decades. I need to reboot Windows every 2-3 days
Yeah, I hear you on that. Another possibility is that NPR is trying to manufacture a sense of mystery or surprise as journalists often do with science stories. A bit less nefarious but also sort of annoying in its own way.
Completely not related to the topic at hand but ;)
I love this "spend hours figuring out the API" then giving up.
Maybe it's just me, but I have noticed that a lot (and I mean a lot) of devs will greatly overestimate anything manual they need to do that they don't like, and spend hours if not days trying to automate it. Which is fine if it's gonna be needed again soon or over and over. But really, what is so bad about spending literally 5 minutes doing the above manually with a Jira filter and bulk edit? And by extension, sometimes there's not even a bulk edit and you need to click through the same 5 steps 50 times to achieve something. Again, 5 minutes of actual work. Just put on a nice fast song from whatever music genre you happen to love and do it. Done.
I discovered this in my PhD. I spent 3 months trying to automate a process (running a finite element model atop a very fragile software platform). I couldn't figure out how to automate my way around the errors the system would give me, every run was unpredictable in duration due to the errors, and I ended up spending a lot of time holding the system's hand.
Eventually gave up on the automation and resolved to spend a few weeks clicking every 15 minutes to complete the simulations. It was very hard to stay motivated, but I got two papers out of it.
The disadvantage is now my CV says I'm an expert at an FEM framework which I never want anything to do with ever again.
Normally if your eyes and hands are synchronized to do an operation, it can be automated.
Sometimes problems come from trying to automate at the lowest level, when maybe screenshotting a vm and sending a click at a specific position if the pixel colors change to the error condition could be done a bit faster.
Journalists don't choose the clickbait headlines and generally don't care for them; they are business decisions, especially for any site that makes money via ad clicks. They even do A/B testing: initially some visitors are shown headline A, others see headline B. If B gets more clicks, A is dropped.
I find a lot of content on YT is like that. The actual work is skipped or hand-waved, then it's a cut scene every other second with some music on top... Idk. I've started unsubscribing from channels lately. There are still some good ones.
I'm not sure how much more mystery one needs than 'Only orbiting telescope in human history has a fault our best engineers haven't diagnosed as yet!' :-)
> A NASA history of the Hubble,[24] in discussing the reasons for switching from a 3-meter main mirror to a 2.4-meter design, states: "In addition, changing to a 2.4-meter mirror would lessen fabrication costs by using manufacturing technologies developed for military spy satellites."
What's mindboggling to me is the sheer number of spy satellites that were more or less each equivalent to Hubble.
There are estimated to have been something like seventeen of the KH-11 satellites, each one of which cost something like 2/3s of a Nimitz class aircraft carrier.
Hubble was botched initially, they had to send astronauts up to fix it before they could get usable images out of it. I wonder what the success rate for those spy satellites was. Maybe some of them were replacements for others, replaced instead of repaired?
> Anyone else look at the headline and feel this is one of the dumbest headlines ever. It makes it sound like NASA's incompetent. 'Why haven't you fixed Hubble yet?'.
I didn't get that impression.
This does intrigue me - I like browsing hackernews, but I often get the impression that some people (not the PC specifically) here are either ridiculously anal about English or genuinely do not parse sentences the way I do.
> I often get the impression that some people (not the PC specifically) here are either ridiculously anal about English or genuinely do not parse sentences the way I do.
People with autistic traits sometimes parse sentences in an overly literal or precise manner. I rarely do this anymore, but when I was a child and teenager I did it more often.
When I'm speaking of "autistic traits", I'm not speaking just of people diagnosed with autism/ASD (who are of course represented here), but also people with broad autism phenotype (BAP), the subclinical manifestation of ASD. BAP is when you have more of the symptoms of ASD than the average person does, but not enough to justify an actual diagnosis of ASD. BAP is quite common in software engineers, and STEM professionals more generally, so I think there are likely a lot of people on this site with BAP (albeit most of them have probably never heard of it.) The people you are talking about quite possibly do have some degree of BAP, and this behaviour is quite possibly a manifestation of their BAP.
But the headline is literally and precisely true, at least at present. At some point they may be able to figure out the issues enough to restore operation, but not yet.
Sometimes, how this trait manifests itself, is in a difficulty in seeing that a sentence could have multiple meanings; the mind focuses on one particular meaning and struggles to perceive the other possibilities. Usually there are several different ways to read something in an overly literal/precise manner; the fact that one of them is true doesn't do much good if one's mind has decided to fixate on one of the others that isn't.
They use “can’t”, meaning “can not”. Unable. That’s misleading, and no doubt intentional, to create more drama. The more accurate word would be “haven’t”, or “haven’t yet”, but that’s not alarming enough.
Can’t work out gives a sense that they have tried everything and failed. “Haven’t worked out yet” is still factual and implies that they are still working on it.
> There doesn't seem to be any nuance or respect that they're trying to repair an orbiting telescope that was launched 30 years ago and designed 40 years ago - and that people are patiently trying to sort through a fully autonomous system 400 miles above the surface of the Earth with a very large set of failure options.
That's what the article says. The headline is clickbait.
Still? It got exponentially worse in the last few years. I deliberately turn off unplanned auto-updates by any means, because a Windows 10 reboot/shutdown (when I really need one: drivers, etc.) takes minutes because of mandatory updates. When I still had a Windows laptop and had to bring it somewhere, I hibernated it instead of turning it off, because turning it off would mean “please wait for 20 minutes of battery stress test”, or a few hours if it’s that lucky day of not-a-service-pack-4.
Saying “can’t figure out” gives me the impression that they’ve tried their hardest and have concluded they are unable to diagnose or solve the problem, which is not accurate.
pushes up glasses I would watch the heck out of a Twitch stream of their debugging/brainstorming sessions. I always loved the movie Apollo 13, especially the technical troubleshooting parts.
Henry S F Cooper Jr.'s book The Evening Star describes some of the remote debugging and other problem solving that was necessary when the Magellan probe experienced computer problems while orbiting Venus. It's been a few decades since I read it, but it was pretty detailed and rather exciting.
Occasionally, very occasionally, it’s nice to read somebody just expressing enthusiasm instead of just posting a clever counter argument. It’s like a spice that you only want a little of but that’s still nice
Eh, I can understand both sides of the fence. 'this is why I read HN' is nothing but a slightly more verbal '+1', but as you stated it does humanize HN and makes it feel more social.
In terms of downvotes, what really irks me and what I often see is people posting factually correct information, but still being sent into faded oblivion because some sect of the community's worldview doesn't agree with the facts.
I don't understand this viewpoint. Information being factually correct is a low bar. I have a lot of factually correct information that is irrelevant or misleading, or that I could state in a way that drags down the level of discourse more than it illuminates truth. Factually correct information is usually involved in tu quoque fallacies, or used to goad people into drawing false, non-sequitur conclusions. The Hacker News guidelines lay out a list of expectations for comments that go beyond factual correctness.
If someone uses factually correct information to make a comment thread worse, I can see how downvotes could be justified.
Since comment scores were removed, this is the only way to signal this to others besides the original commenter.
That said it should not be overused. If it annoys someone I guess they should downvote it but I don't think there is a need to reflexively downvote every time someone adds a friendly meta comment.
(And if people start gaming it for karma farming I guess it should be downvoted relentlessly until that stops :-)
It didn't necessarily bring anything to this one conversation. It did, however, communicate that "this is the type of information that that person finds valuable on Hacker News". And knowing what other people in your social group like to hear/discuss is an important part of keeping that group vibrant and wonderful.
So no, it's probably not as useful as the comment it was referring to, but it was useful (to some of us) as it pertains to the community as a whole.
It's arguably even worse than just commenting "This." At least that is small enough you can scan over it and barely even register its existence. But this fedora-tipping "Thank you kind sir this is the type of Internet Content I enjoy!" doesn't even afford you that luxury.
I really dislike the downvote function because it reinforces self-censoring.
And I completely loathe the implementation of it, you need xxxx upvotes to downvote posts....I have no words.
Well, in opposing it I especially read the faded comments and upvote any of those that are not completely abhorrent.
This seems to happen a lot more frequently here than anywhere else.
I'm not really sure what that says, other than people still read comments that are faded. Also that people shouldn't worry about self-censoring.
I don't have a problem with downvotes or the karma needed to do it, but I do sometimes wish it were possible to reply to a dead comment, especially if you vouch for it and it's still dead.
Sometimes they're worth defending, or are relevant in a non-obvious way, and sometimes the comment itself is discussion-worthy as it relates to the topic, even if it's wrong or seems trollish.
I think the point is to instead of using the keyboard, use the mouse to click the up arrow, and leave it at that.
(I know how tempting it is to reply quickly to something. I have the urge to just post what's going on in my mind right away, unfiltered. So I am very forgiving, but not everyone is.)
That only tells the person who owns the comment that you appreciate it. With comment scores, you're correct that almost all "I like this" comments are wrong; without comment scores, they become useful again.
Worth considering that comment scores were hidden for a reason. Exposing that information to everyone, as opposed to just the comment author, does not necessarily improve the discussion.
My favorite part is when they needed the version of the software that was used for the moon landing, but they only had the source code for a previous version (scanned from a giant binder) and the hash value of the version used for the landing. Through a series of educated guesses, reading memos, and analysis of the source code, they modified the old code in exactly the right way so that it gave them the correct hash, confirming that they had correctly and exactly recreated the original code.
If you want to see debugging a computer in space, check out Apollo 13's sequel, Apollo 14. The moon landing is being held up by a shorted-out switch that's causing the LM to abort the landing, and it's up to the programmers back home to figure out how to work around it in time to allow the landing.
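To make the idea concrete, here is a minimal sketch of checksum-guided reconstruction. Everything in it is an assumption for illustration: SHA-256 stands in for whatever check value the real team had, the three-line "source" is made up, and `digest`, `apply_patches`, `reconstruct`, and the candidate patches are hypothetical names, not anything from the actual effort.

```python
import hashlib
from itertools import combinations


def digest(lines):
    """Hash a listing; SHA-256 is an assumed stand-in for the real check value."""
    return hashlib.sha256("\n".join(lines).encode()).hexdigest()


def apply_patches(base, patches):
    """Return a copy of base with each (line_index, new_text) patch applied."""
    out = list(base)
    for index, new_text in patches:
        out[index] = new_text
    return out


def reconstruct(base, candidate_patches, target_digest, max_patches=3):
    """Try small combinations of educated guesses until the digest matches."""
    for count in range(max_patches + 1):
        for combo in combinations(candidate_patches, count):
            candidate = apply_patches(base, combo)
            if digest(candidate) == target_digest:
                return candidate
    return None  # none of the guessed combinations reproduces the target


# Made-up example: the older listing we have, and the digest of the one we want.
old_version = ["LOAD A", "ADD B", "STORE C"]
flown_version = ["LOAD A", "ADD D", "STORE E"]
target = digest(flown_version)

# Guesses gathered from (hypothetical) memos; one of them is a red herring.
guesses = [(1, "ADD D"), (2, "STORE E"), (0, "LOAD Z")]
rebuilt = reconstruct(old_version, guesses, target)
```

The point of the design is that a matching digest is strong evidence the guesses were exactly right, while a near miss tells you nothing, which is why the guesses have to come from memos and code analysis rather than brute force.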
Apollo 13 was the story of a 'successful failure', while Apollo 14 shows how hard work and creative thinking can turn failure into success.
Don Eyles' book Sunburst and Luminary has a chapter on this, and Don was primarily responsible for the Apollo 14 workaround. The book is also generally just a fantastic account of what it was like to develop software for the Apollo Guidance Computer.
Ahh, yes; I've seen the series multiple times - agreed that it's great, especially for those who enjoyed Apollo 13. I wasn't sure if there was a different Apollo-14 movie OP/Trothamel was referring to...
Was this the scenario where there was a false positive warning light but they had no way to test if it was truly a false alarm? I remember attending a talk by an Apollo engineer who convinced the control room that the switch design had a propensity to short, and it really came down to a probability-based judgement call.
Honestly that movie is a lot of why I got into math and engineering.
Jim Lovell was undeniably a badass but I watched that movie and thought the heroes were the ones reading telemetry off a computer screen and using their slide rule to figure out what to do. I hope Hidden Figures does that for another generation.
From my work experience it's like this:
1. Assemble commands to run
2. Run the commands and see results in the 15 minute window
3. See if you can do more commanding in the minutes you have left
4. Make a new plan, wait for next pass, and goto 1
For LEO satellites, that usually means you have 2 blocks of three 13-minute passes when the ground station is in the Netherlands. For a Svalbard ground station you get a lot more, but still passes of 13 minutes or less.
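The loop above can be sketched as a tiny planner that packs commands into passes. The 13-minute window comes from the comment; the per-command durations and the `plan_passes` helper are made-up assumptions, and real planning would also budget for retries and for reading back results.

```python
def plan_passes(commands, pass_seconds=13 * 60):
    """Greedily pack (name, estimated_seconds) commands into consecutive passes."""
    passes = [[]]
    remaining = pass_seconds
    for name, seconds in commands:
        if seconds > pass_seconds:
            raise ValueError(f"{name} cannot fit in a single pass")
        if seconds > remaining:
            # Out of time in this window: wait for the next pass.
            passes.append([])
            remaining = pass_seconds
        passes[-1].append(name)
        remaining -= seconds
    return passes


# Hypothetical command list with guessed durations in seconds.
queue = [("dump_logs", 300), ("upload_patch", 400), ("verify", 200), ("reboot", 780)]
schedule = plan_passes(queue)
```

Keeping the order fixed (rather than reshuffling to fill gaps) matches step 4 above: if something does not fit, it waits for the next pass and the plan is revisited.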
I've worked on something like this, just a lot more mundane. We had Linux PCs strapped to the ceiling of various locations, mostly malls, together with a camera and projector to produce an interactive display on the floor. I had a couple of times where somebody would be onsite and the projector would be off or the display would be mangled. And it takes quite a while to get a lift to get up to the box (if it would even be allowed at that time of day), there was no network at that time, and all they had was a wireless IR keyboard that occasionally dropped keypresses.
Imagine dictating shell commands, over the phone, to a salesperson who has no idea what half the characters are that you're asking him to type, and the only output signal I could come up with was ejecting the CD tray, which was just visible from the ground...
(Note that the goal wasn't usually to fix things on the spot, it was more to triage things like whether we needed to have a replacement projector on hand, which was a big deal.)
The amount of preparation is much, much more than every other "accessible" installation. Typos are the worst to recover from; backspace usually doesn't exist.
As I've sent commands to our Linux-running satellites, it's usually prepending your commands with ctrl-c characters and adding a couple of newlines at the end, just to make sure it runs and nothing is left in the buffer.
There is also a possibility that commands get executed multiple times, and there are usually limits in transmission speed, processing speed, and frame length.
Sending a lot of characters over a terminal can cause characters to be eaten, creating typos you can't see, affecting the commanding immensely.
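A minimal sketch of the framing described above, assuming a plain serial-style shell on the other end. `frame_command` and its defaults are hypothetical; the right number of interrupts and newlines would depend on the actual link.

```python
CTRL_C = "\x03"  # ASCII ETX, what a terminal sends for ctrl-c


def frame_command(command, leading_interrupts=2, trailing_newlines=2):
    """Wrap one shell command so stale buffered input is cancelled first."""
    if "\n" in command or "\r" in command:
        raise ValueError("send one line at a time")
    return CTRL_C * leading_interrupts + command + "\n" * trailing_newlines


framed = frame_command("uptime")
```

The leading interrupts clear anything half-typed from an earlier pass, and the extra newline makes it likely the command is submitted even if one newline is dropped.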
Isn't there some way to ensure that what you typed is what is being executed? Dropping characters from the terminal sounds terrifying.
I don't know enough about ssh and terminals to know if it's possible to type "12345" and see "12345" echoed back to me but really what the remote session sees is "1245".
Yes, terminals usually echo back the characters. In our case this would be buffered and we could request the buffer. But that would still take some operations. Best way, usually, is to send a bunch of commands in a way you ensure proper order of execution (eg write a file, check checksum of file, execute file), and make sure you can pull the logs afterwards.
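The write, check, execute ordering can be sketched like this. It is only an illustration under stated assumptions: the remote side is assumed to have `printf`, `base64`, `sha256sum` (as in GNU coreutils) and `sh`, and `build_upload_plan` and the `/tmp/job.sh` path are hypothetical.

```python
import base64
import hashlib


def build_upload_plan(script_text, remote_path="/tmp/job.sh"):
    """Return shell lines that write a script, verify it, and only then run it."""
    data = script_text.encode()
    # Base64 keeps the payload to a safe alphabet, so quoting cannot mangle it.
    encoded = base64.b64encode(data).decode("ascii")
    checksum = hashlib.sha256(data).hexdigest()
    return [
        f"printf '%s' '{encoded}' | base64 -d > {remote_path}",
        # sha256sum -c fails on any dropped or flipped character, so the
        # script after && never runs on a corrupted upload.
        f"echo '{checksum}  {remote_path}' | sha256sum -c - "
        f"&& sh {remote_path} > {remote_path}.log 2>&1",
    ]


plan = build_upload_plan("echo hello\n")
```

Because execution is chained behind the checksum check, a dropped character results in nothing running rather than the wrong thing running, and the log file is there to pull on a later pass.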
Nowadays, links and systems get easier to work with, and you can sometimes have a literal TTY open to the system, like Reaktor Hello World has ( https://reaktorspace.com/reaktor-hello-world/ ). However, this is over S-band, which is a 2Mbit/s link, so overhead for a stable TTY (or ethernet connection) is a lot less than using UHF/VHF.
One of the more interesting projects I ever had to complete was while awaiting clearance when I first started working for the NRO years back in ground processing algorithms (no longer work there, by the way). One of the long time guys gave us an assignment to implement the BCH error-correcting code, and then iteratively optimize it until we were able to implement the algorithm for computing the code published in a company proprietary whitepaper that was more efficient than any of the publicly published algorithms. That was just everything that is fun about programming. The actual production implementation corrected up to 8 bits per block, though I can't remember the block size any more.
Ground processing was a totally separate program and contract from mission control, though, so my team only ever received data from the satellites. We never sent data to them.
It depends on the API. If your API is "put this data over uart to the TTY", and the uart of the device is overloaded and drops characters... Or maybe mangles characters due to bitflips. Or what have you. It's all possible!
If you ever worked on a busy mainframe your compile jobs could easily be queued for 20-30 minutes. Made you much more careful to check for typos and do test runs of the code "in your head" before submitting.
Fun weirdness of even limited multilingualness: For some reason my brain first parsed this as "a pollo in real time" - or, from Spanish, "a chicken in real time".
It's worth noting that the people in that movie were WAY more loud and emotional than the real NASA engineers and operators.
You can see how NASA people react to tough situations by watching the videos of mission control during the Challenger and Columbia disasters. No shouting. No arguments. Just cool professionalism and restrained emotions.
They have a job to do, and they do it well even under stress. "Steely-eyed missile men/women" indeed.
>the people in that movie were WAY more loud and emotional than the real NASA engineers and operators.
There are plenty of NASA engineers and leaders who lose their cool. I’m only saying that so people don’t overly lionize them in a way that prevents them from pursuing a similar job because they feel they are somehow cut from a different cloth.
Everybody knows that when presented with the irrefutable evidence that the Challenger o-rings would fail, they more or less just let the astronauts die. Definitely cut from same cloth as any other org.
That’s not quite accurate. It wasn’t that there was “irrefutable” evidence that the o-rings would fail; it was that there wasn’t data on whether they would, or wouldn’t, fail.
“The O-rings were never tested in extreme cold.”[1]
There wasn’t data which led to discussions about uncertainty, but that shouldn’t be conflated with irrefutable evidence of failure.
The obviousness of it (like many engineering failures) was only apparent in hindsight.
“Evidence, in retrospect, points to a long period of time, especially based on post-flight inspections when the joint design weakness was ‘sending a message’ and the true potential of this message was not perceived and reacted to.”[2]
“Not perceived” isn’t compatible with “irrefutable evidence that it would fail”.
I dunno. That sort of thing is exactly my job, except the remote devices are still on earth somewheres. What you'd see is me sitting in a library drinking coffee and looking at source code and schematics until I had an answer that matched the evidence.
What's in your head could be though. That's my pet theory on the movie Hackers, what we're seeing on the computer screens isn't what's actually there, it's the characters' mental constructs visualized.
In the movie Hero[0], two kung fu masters fight a battle purely in their minds. And when the mental fight was over, they only execute one physical move to finish the battle.
Think of this as the computer equivalent of that scene.
_Hero_'s always been one of my favorites. A lot of kung fu movies try to strike a balance between aesthetics and realism - I really enjoy a movie that picks one (in this case the former) and goes all in on it. It's got a fight that takes place entirely on the surface of a lake, and another that takes place in a forest of falling leaves that change color several times throughout the scene. It's an incredibly beautiful movie.
Yeah, this is strictly artistic kung fu, which is basically high-mortality ballet. There are also many "realistic" martial arts films, if that's your thing. I enjoy both styles, depending on mood.
Yes. A lot of Western action movies owe their inspiration to Chinese movies (and I suppose, vice versa). In this case Hero (2002) -- or a similar movie, since I doubt it invented this trope -- is likely an inspiration for A Game of Shadows (2011).
No, but "Debugging on a computer that is down... in space," does sound more interesting, right?
You have a computer that you can only interact with over a radio link, and need to make it start working again with only what you know about how the system is built and a limited set of remote commands. Sounds like something I'd get obsessed with solving.
I did something like that when trying to boot my brand new root server in Finland a few weeks ago (tried ~50 times while having UEFI enabled plus mdadm raid1 on GPT partitions, never worked, asked support to disable UEFI, worked).
Confirming that not being able to ping/connect to it during the failed attempts was absolutely not exciting :)
You could have a whole channel with different teams debugging satellite technology and if you're bored, it would probably be quite interesting. The bigger problem is most likely concerns about IP and secret protocols and so on.
"And now Bob will log into TeleSat123 via SSH."
<We see bob type in root / password123>
"Oups, uh..gosh we'll just go to a commercial break!"
I am halfway through the second season of For All Mankind, it's a series based on an accelerated (versus the decelerated) series of events around the time of the Apollo missions. I think you may enjoy it!
I know that, at different times, NASA has used Forth[1] and Lisp[2] in some of their space applications. Both of these languages offer REPLs that generally accelerate the debugging process, and while your "average" Lisp might be unsuitable for hard real-time applications (due to the presence of a garbage collector, usually without the hard real-time constraints that you can get out of garbage collectors with extreme effort), I wonder if they have some equivalently interactive system on-board the Hubble.
> Most of Hubble's components have redundant back-ups, so once scientists figure out the specific component that's causing the computer problem, they can remotely switch over to its back-up part.
Wait, then why don't they just switch over each component in turn? The "divide and conquer" debugging strategy.
> Wait, then why don't they just switch over each component in turn? The "divide and conquer" debugging strategy.
My guess would be that they want to try that method only if this debugging doesn't work. Imagine that there's an electrical issue in item 1 that fries item 2. If you switch over to item 2b, then you fry item 2b too!
This is exactly what happened with the Soviet Salyut 7 station. They tripped an over-current protection, didn't fix the root issue, and remotely turned the circuit back on. A series of electrical shorts then rendered the entire station without power, resulting in the need for one of the most daring station rescue stories of all time:
>"The rule of thumb is when something is working you don't change it," Hertz said. "We'd like to change as few things as possible when we bring Hubble back into service."
A lesson not taught to any modern software developer. Instead they change things all the time for no real reason other than that they want to change things.
One of the best senior engineers I worked with taught me how to run an outage. The most important thing? Stop what you are doing, take charge, and get everyone else to stop what they are doing.
The best case scenario of a bunch of engineers flailing about on a bridge turning knobs is that you luck into a fix but don't know how you got there. But you're more likely to make things worse.
In my experience, the best strategy depends a lot on the severity of the outage.
If all the alarms are going off because of a loss of redundancy, then currently there is no outage. The correct move should be carefully considered, and maybe tested in the sandbox environment.
If there is currently a 100% outage, it's best to go all out on trying every possible fix, because typically you'll restore service quicker that way. Sure, occasionally you dig yourself a deeper hole, but usually it's the best strategy.
> If there is currently a 100% outage, it's best to go all out on trying every possible fix, because typically you'll restore service quicker that way. Sure, occasionally you dig yourself a deeper hole, but usually it's the best strategy.
Almost assuredly not. If a system hits a 100% outage, there are about to be a series of cascading failures by dependent systems. If you don't even understand which system is the root cause, all you're doing is testing a bunch of never-before-tried combinations in production and hoping something works.
> If there is currently a 100% outage, it's best to go all out on trying every possible fix, because typically you'll restore service quicker that way. Sure, occasionally you dig yourself a deeper hole, but usually it's the best strategy.
Maybe. Once an outage has happened, an additional 30 minutes or hour of outage, depending 100% on the service in question, might be a bearable cost if it means preventing a domino effect of future issues caused by measures taken to restore the outage in a hurry.
This is not true. If something's working, you don't change it for no reason.
Business requirements and requests change all the time. 90% of our work is done in response to that. The other 10% is fixing up technical problems due to increased scaling or bugs found, and then basically never do we upgrade a system to keep up with security updates or change to a more modern tech.
This highly depends. I believe what Grumple is talking about is contracted work for a specific customer that is asking you for something different, even though what you've already delivered works. You're talking about consumer-facing product teams that make whatever they want and try to sell it after the fact, coming to their own decisions regarding what their "requirements" should be. When your work is funded by a specific customer, the requirements are whatever that customer says they are. And they're almost never going to let you change something just because you wanted to in order to learn a technology you can put on your resume.
> Wait, then why don't they just switch over each component in turn? The "divide and conquer" debugging strategy.
Let's say the CPU is the actual issue, but the problem manifests itself in the memory module. You swap over to the backup memory module, and suddenly the problem vanishes!
Two months later, the problem manifests again. Identical presentation. This time, there is no backup to switch over to test.
You fly a Very Expensive Mission to the telescope only to find out the CPU was the issue, and if you had figured that out originally you wouldn't be up here with four memory modules.
Let's say the memory is the actual issue, but it's manifested itself by data being corrupted and triggering undesired behaviour. Unfortunately, the corrupted state has already been written back to persistent storage.
So you swap in the backup storage module and all your problems go away. Until it happens again and corrupts _that_ too.
> while your "average" Lisp might be unsuitable for hard real-time applications (due to the presence of a garbage collector, usually without the hard real-time constraints that you can get out of garbage collectors with extreme effort), I wonder if they have some equivalently interactive system on-board the Hubble.
This is fascinating to me. Do you have any pointers to information/research/projects focused on hard real-time garbage collection? A Lisp with hard real-time garbage collection (even if Herculean to implement) would be fantastic.
I've seen at least one implementation that uses explicit regions; before doing something with say a bunch of cons operations, you allocate a region, all the evaluation is done in the region, it returns a result to the parent region, and the region is then manually dropped, freeing the space used. Set up another region for the next large evaluation and so on.
Almost C style, and I'm sure just as error prone, but it seems like it could work.
As long as there aren't cycles of course you can do deterministic GC, it's just reference counting. It also helps if the program is single-threaded since otherwise any memory allocation/freeing can be unpredictable (since it probably locks.)
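A toy sketch of that idea, with cons cells that are freed the moment their count hits zero. This is only an illustration: `Heap`, `Cons`, and `release` are hypothetical names, Python's own memory management is ignored, and the `live` counter just models how many cells would still be allocated.

```python
class Heap:
    """Tracks how many cells are currently allocated."""

    def __init__(self):
        self.live = 0


class Cons:
    def __init__(self, heap, car, cdr=None):
        self.heap = heap
        self.car = car
        self.cdr = cdr
        self.refcount = 1  # the creator holds the first reference
        heap.live += 1
        for child in (car, cdr):
            if isinstance(child, Cons):
                child.refcount += 1  # this cell now also refers to the child


def release(cell):
    """Drop one reference; free the cell and release its children at zero."""
    stack = [cell]
    while stack:
        current = stack.pop()
        current.refcount -= 1
        if current.refcount == 0:
            current.heap.live -= 1
            for child in (current.car, current.cdr):
                if isinstance(child, Cons):
                    stack.append(child)
            current.car = current.cdr = None


heap = Heap()
tail = Cons(heap, 2)
head = Cons(heap, 1, tail)  # tail is now referenced twice
release(tail)               # our own handle on tail is gone; head keeps it alive
live_before = heap.live
release(head)               # frees head, which in turn frees tail
```

Freeing happens at a predictable point, but note that releasing the head of a long list still does work proportional to its length, and two cells that point at each other would never reach zero, which is the cycle caveat above.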
I've seen Prescheme (a subset of Scheme used to bootstrap a full-fledged dialect of Scheme) that seems to allow only stack-based allocation and therefore forbids certain types of closures that cannot be reduced via hoisting at compile time.[0]
Lisp with ref counting (assuming acyclic references) instead of GC could also be interesting to try to hack together. I have a feeling closures would be a particularly quick way to get reference cycles, so some concept of weak references may be necessary.
Another comment keyed onto a concrete example of why not, so I'll go with a presumptive reason:
Some of those elements will be part of the major service windows, and have expected operational and standby lifetimes.
So if a component with two elements has a service window of 10 years, and each element contributes to meeting that service window, then you've bumped your major service window from 10 years to a significant factor less than that.
e.g. the expected use profile might be: use element 1 for 6 years or 60% of service, switch to element 2 for 4 years, replace both during 10 year maintenance window. Interrupting that by bringing element 2 up reduces that window and contingency plans if the service window cannot be met.
I don't know, and I'm just talking out my you-know-what.
> Wait, then why don't they just switch over each component in turn? The "divide and conquer" debugging strategy.
It's better to understand the problem than to just start changing stuff hoping you find the right thing, even systematically. There's not a huge rush to fix this, since it's the payload computer and the telescope is still being maintained by other systems. A lot of bad things could happen: if the switching system is flaky you could maybe get stuck in a bad system, or if there are a number of faults you might damage one of the backups. Without the shuttle there's no plan to service it any more, so why take the risk of rushing through the most simplistic debugging method?
Forth yes, lisp not. Lisp was only used on ground to simulate the rover.
Also a REPL in space only makes sense in Earth orbit, but not farther away, with 8-20min waiting time for a packet roundtrip to Mars. Those machines really need proper and faster decision making (AI, think lots of `if` statements and proper modeling) on board, e.g. to perform landing or docking maneuvers. Or to detect and work around radiation damage in their own circuits.
You might check on how New Horizons was implemented. They initially wanted to fly LISP, but (I think) they eventually used the LISP to generate C++.
This was to allow dynamic, model based re-planning when the craft was in the outer reaches of the solar system.
"The operations team is investigating whether the Standard Interface (STINT) hardware, which bridges communications between the computer’s Central Processing Module (CPM) and other components, or the CPM itself is responsible for the issue. The team is currently designing tests that will be run in the next few days to attempt to further isolate the problem and identify a potential solution."
So "can't figure out" sounds more like "haven't yet figured out", but they have remaining ideas to play through.
> Most of Hubble's components have redundant back-ups, so once scientists figure out the specific component that's causing the computer problem, they can remotely switch over to its back-up part.
Of course they do! I wonder if they ever had to put another part out of service. I also wonder whether the twin of the part could also suffer the same failure at the same time without being used.
And that despite regular servicing back when we still had the Space Shuttle.
Hubble started out with 6 gyroscopes, in 1996 they replaced four of them, by 1999 four had failed so they replaced all six, by 2009 three had failed again, so they replaced all six. Now they are again down to three, and one of the remaining ones has a defect that required some workarounds. The last three gyros are at least a new design that should last a bit longer.
It sounds like a specific failure of gyros, what characteristic causes that failure? Are they more susceptible to cosmic rays or something? Do you know how they've mitigated that failure?
NASA has a Satellite Servicing Capabilities Office developing on-site robotic servicing capabilities for the Hubble and other large satellites. This is connected to the On Orbit Servicing, Assembly, and Manufacturing program at NASA Goddard. They've been working toward this for a decade.[1]
No part of that effort has actually repaired anything in space.
That article specifically is popular with the Rust crowd, yes, but moreover, roughly that number (70%-80%) has been replicated by multiple big tech companies, not just Microsoft. Chrome was another large name.
(And yes you're right none of this has to do with this stuff, for sure.)
The probably 5 people in the world who are domain experts in the Hubble's control system don't really need a hundred backseat drivers, that won't help any.
And as someone who has been invited into "war rooms" by managers who do the "you're smart so of course you can help these other smart people stuck on a hard problem" there's a real skill to being able to read the room and know when to back the fuck out of it or just shut the fuck up -- which most intellectuals don't have. Sit and listen for awhile and take it in. Then maybe take your best idea and ask a very toned down question. If the person who seems to be leading the troubleshooting instructs you on why that's wrong, throws in 3 neighboring ideas that also don't work, with 5 reasons you haven't considered for why that's the entirely wrong path, then just nod in agreement and be quiet and see what you can learn from the domain experts.
Peppering that team with a dozen outside "experts" is going to be useless because they'll just start getting really defensive after awhile, and even if someone winds up throwing out the right solution they'll probably reflexively reject it.
OTOH if that team ASKS for someone who has expertise the team lacks and needs, then go assemble a team skilled at the use of cellphones and the internet to hunt that person down and drag them into the conversation.
Give me 5-10 years at NASA working on Hubble and I can be a domain expert useful in the room. Until then I'm a C++ expert who needs to keep his mouth shut unless asked a difficult C++ question (I wouldn't be surprised if Hubble is written in Ada and they can't possibly have a difficult C++ question).
Despite the article trying to phrase it as if they have no idea what to do, they know their computer incredibly well, it's a matter of going through the steps and isolating the problem.
I doubt we (humans) can be respectful enough to each other in a truly global event where an individual partakes in the war room. What I'm thinking here is, and I admit a rather simplistic view ... here is the problem; here is how to observe/debug, submit what you think would be the solution. This would be reviewed/vetted.
Most likely school/college/university teams knowledgeable in the subject matter would be the "individuals".
Wouldn't it be a good idea to open the source code for community inspection? Of course NASA would panic with Russia and China inspecting for "hostile" actions, but hey, if they don't know what to do, why not call in the expert reverse engineers of the world?
Idk about the payload computer, but I've got to think guidance and operation controls would have at least some remnants of legacy keyhole technology, or would expose hardware details that might still be sensitive information, even if the software was a total rewrite.
Hubble is basically a US spy satellite, pointing outwards instead of inwards. I'm sure there will be similar classes of hardware still in operation, so could be some sensitivity there.
Then hiding it when it's on Earth could make sense. Actually attacking Hubble is much more complicated than attacking anything on Earth, where you can actually put your hands on it, connect through JTAG and understand what's wrong (besides spying).
Interesting thought. I would say something like the Hubble transcends political/governmental boundaries... although I do wonder how much of its software is used in other secretive satellites.
Hubble has only lasted as long as it has because it was serviced on 5 separate occasions--although one of those was to fix a manufacturing defect. I don't think any of the spysats had service missions.
On the contrary I doubt that the US military would give up advanced imaging technology, like reading car plates from space, for nothing. That's the Hubble. Nothing else comes close.
I don't think anybody else believes that they would, either.
The question is not "is the US military still doing telescope-based surveillance?" it's "is the US military doing it with Hubble-class stuff, or are they doing it with newer stuff?"
Given the age and amount of hands-on maintenance required to keep the Hubble working, it's almost certain they're doing it with newer stuff.
I had no idea that the Hubble had been serviced by shuttle crews (multiple times even). Exactly matching orbits with a tiny object zooming around the earth at 10s of thousands of kph sounds like a very difficult and impressive feat.
Quote: "Most of Hubble's components have redundant back-ups, so once scientists figure out the specific component that's causing the computer problem, they can remotely switch over to its back-up part."
So it's time to do what every gamer does when the rig fails. Switch parts and see who's the culprit.
And yes, I do understand the next quote: <"The rule of thumb is when something is working you don't change it," Hertz said. "We'd like to change as few things as possible when we bring Hubble back into service.">
But between having nothing anymore, since Hubble had its last maintenance in 2009 (per quote: "The last time astronauts visited Hubble was in 2009 for its fifth and final servicing mission.") and having something now that definitely would fail later, I'd choose the latter.
It doesn't have anything to do with specificity, the Space Shuttle was just the only manned spacecraft powerful enough to get out to Hubble's orbit and back, and that had an airlock so you could actually access Hubble.
The Space Shuttle is the only vehicle ever built that can do in-orbit service. It's not that the Hubble is special in that regard, it is that the Shuttle was special.
ISS has airlocks that allow you to leave without removing all the air from the rest of the ship. Vehicles like a Dragon can attach their port to ISS, board, and then perform a space walk through the ISS's airlock.
Hubble is different. It's not like it's a ship that you can board. So you need two things: Ability to attach yourself to Hubble, and ability to leave Dragon to perform a spacewalk. It's not clear whether you can just have everyone in the Dragon suit up and open the hatch. And even then, you still need to attach yourself to Hubble somehow. I think you can via the port... but then you can't leave. Unless you go out the other door? Can you open that from the inside and get out with a space suit?
My rambling isn't meant to be an actual answer. It's more to show that it's wayyyy harder than "Let's just send up some people to Hubble!".
These problems could be solved. However no current space craft is designed the right way. Maybe it is a trivial modification to Dragon (making it bigger...), maybe it is better to start from scratch. That is a question for domain experts who probably haven't given the idea enough consideration to give a good answer.
> So does this also mean that the ISS is no longer able to get serviced, or are there projects to work on in-orbit service vehicles?
AIUI, the ISS can be serviced from the ISS if the appropriate supplies and personnel are sent up, but it doesn't have the delta-V to zip around other orbits servicing other satellites, so it is okay without the shuttle for itself, but doesn't substitute for it for other things needing orbital service.
All the operating human launch systems are just capsules meant to either free fly or dock at a station. They don't have airlocks to let people out, so using them for a Hubble repair would require a lot of modification, plus the danger of using the whole capsule as an airlock. [0]
[0] Except Soyuz, I guess: its orbital module would allow you to keep the descent module pressurized, but it's still way outside the design, so there's no telling if the module would remain operational.
Soyuz has been used for spacewalks before, and the cabin is tolerant of vacuum. They haven't done that in ages, so it's possible that's been optimized out, but I'd suspect that requirement's been respected over the years.
Neat, didn't know that. There are of course other issues, like how you keep them in proximity while you do work that could stretch over days. Ships don't usually free fly that close to each other for long; the Hubble missions done with the shuttle all included hard captures of Hubble by the Shuttle.
i once read an article, most probably on arstechnica, about a nasa enthusiast who found some space probe documentation in someone's garage. he goes on to actually use that to communicate with the probe and issue commands.
something like the booster had emptied or leaked or something.
i am not sure what exactly it was. that was a fascinating read
They're trying to find the broken component; what caused the issue is usually a handful of things in space. Those are usually the top causes of component failure in that environment.
Although a relatively small part of my career, I spent some time working within the quality arm of an aerospace org. A lot more on propulsion systems, but both those and satellites are usually required to meet J-STD specs for electronic builds. After a few failure investigations, you become acutely aware of the dangers of prematurely jumping to conclusions.
I think a lot of safety-critical systems even here on earth have exemptions from lead-free regulations because of this... but even lead can form whiskers.
"If this computer were in the lab, we'd be hooking up monitors and testing the inputs and outputs all over the place, and would be really quick to diagnose it," he said. "All we can do is send a command from our limited set of commands and then see what data comes out of the computer and then send that data down and try to analyze it."
"According to NASA, the 3 computers aboard the Hubble Space Telescope contain over 50,000 lines of code in the C and Assembly programming languages."
https://www.leeholmes.com/writing/hubble.pdf
I am going to go out on a limb here and post my diagnosis: There is some global counter that overflowed as the system was not rebooted for a while...NASA...take it from here :-)
My hazy memory is that the NRO offered NASA a few satellite bodies that they could populate with optic systems. Since they were originally designed for monitoring earth they aren’t well suited for capturing data at much longer distances. If I recall correctly I think someone explained that it’s like trying to peer through a straw, it works, but ideally you’d want a much wider field of view. And apparently the James Webb telescope handles this much better.
The NRO satellites actually have an optical design with a larger field of view than Hubble does. One of the NRO spacecraft busses is being used for the WFIRST / Nancy Grace Roman space telescope because of the wider field of view.
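For anyone wondering what that kind of bug looks like, here is a rough sketch of a fixed-width uptime counter wrapping around. The 32-bit width and 1 ms tick are assumptions picked for illustration; nothing in the article says Hubble's payload computer has such a counter.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000

def wrap_unsigned(value, bits=32):
    """Value actually stored in an unsigned register `bits` wide."""
    return value % (1 << bits)

def days_until_wrap(bits=32, tick_ms=1):
    """Uptime in days before a counter incremented every tick_ms wraps to 0."""
    return (1 << bits) * tick_ms / MS_PER_DAY

def elapsed_ticks(start, now, bits=32):
    """Wrap-safe elapsed time: subtract modulo the register width."""
    return (now - start) % (1 << bits)

# Naive code comparing raw values sees time jump backwards at the wrap,
# while the modular difference still gives the right answer.
before_wrap = (1 << 32) - 5
after_wrap = wrap_unsigned(before_wrap + 15)
naive_delta = after_wrap - before_wrap
safe_delta = elapsed_ticks(before_wrap, after_wrap)

print(f"32-bit ms counter wraps after {days_until_wrap():.1f} days")
print(f"naive delta: {naive_delta}, wrap-safe delta: {safe_delta}")
```

The usual fix is the modular subtraction in `elapsed_ticks`, which stays correct as long as the real interval is shorter than one full wrap.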
Comms to rovers etc use store-and-forward protocols with pre-planning of when which node will be able to see which other node. (E.g. it's calculated when which dish on earth can send a signal that will be seen by a probe around e.g. Mars, and then when the probe can downlink to a rover on the surface, and when the replies can be transmitted)
Look into "Delay-tolerant networking" for more details.
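A toy version of that pre-planned store-and-forward idea, to show the shape of it. This is only a sketch: the node names, contact windows and delays are invented, and real delay-tolerant networking stacks do far more (custody, fragmentation, link capacity).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Contact:
    sender: str
    receiver: str
    start: float  # seconds: visibility window opens
    end: float    # seconds: visibility window closes
    delay: float  # one-way signal travel time in seconds

def earliest_arrival(contacts, source, dest, t0=0.0):
    """Earliest time a message created at `source` at t0 can reach `dest`."""
    arrival = {source: t0}
    changed = True
    while changed:  # keep relaxing until no contact improves an arrival time
        changed = False
        for c in contacts:
            if c.sender not in arrival:
                continue
            # Store the message at the sender until the window opens.
            depart = max(arrival[c.sender], c.start)
            if depart > c.end:
                continue  # window had already closed by the time we got here
            reach = depart + c.delay
            if reach < arrival.get(c.receiver, float("inf")):
                arrival[c.receiver] = reach
                changed = True
    return arrival.get(dest)  # None if no chain of contacts reaches dest

# Hypothetical contact plan: Earth dish -> Mars orbiter -> surface rover.
plan = [
    Contact("earth_dish", "orbiter", start=0, end=3600, delay=750),
    Contact("orbiter", "rover", start=5000, end=5600, delay=1),
]
rover_time = earliest_arrival(plan, "earth_dish", "rover")
```

Here the message reaches the orbiter at t=750 s, then sits in storage until the rover pass opens at t=5000 s. That waiting is the "store" part, and it is why the schedule of who can see whom has to be known ahead of time.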
> how would an internet across the solar system work?
Realistically, each ~150 ms sphere would have its own cloud infrastructure. Those systems would then bridge with one another. So idk AWS on Earth and DogeNet on Mars.
I would love for a distributed model as much as the next guy, but it's unlikely to happen for the same reasons that it isn't happening today.
More than 150ms. There are currently satellite internet services that you can get that use geosynchronous orbits outside of that 150ms zone.
The idea is sound, but the zone needs to be a bit bigger in reality. I think the moon is close enough to earth to be in the same zone (assuming antennas on "both sides", and special routing protocols to deal with day/night cycles)
I think you have an interesting idea, but might not be thinking of the physics involved.
> The minimum distance from the Earth to Mars is about 54.6 million kilometers. The farthest apart they can be is about 401 million km. The average distance is about 225 million km.
Loosely, the speed of light is ~300,000km/s. So roughly 182s, 1337s, and 750s as an absolute minimum one-way time from end to end.
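If you want to redo that arithmetic yourself, here is a quick sketch that divides the distances from the quote by the speed of light. It uses the exact value of c, so the results land slightly off the rounded 300,000 km/s estimate. The function name is just for illustration.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s

def one_way_seconds(distance_km):
    """Minimum one-way signal time over a straight-line distance."""
    return distance_km / C_KM_PER_S

# Earth-Mars distances as given in the quote above, in km.
distances_km = {
    "closest": 54.6e6,
    "farthest": 401e6,
    "average": 225e6,
}

for label, km in distances_km.items():
    one_way = one_way_seconds(km)
    print(f"{label:>8}: {one_way:7.1f} s one way, "
          f"{2 * one_way / 60:5.1f} min round trip")
```

Any relay in between can only add to these numbers, since the signal path gets longer and each hop has to receive and retransmit.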
So, there are varying orbits, that's one problem. The other problem is getting items into solar orbits.
I didn't think of this, but now you have an even bigger problem of trying to keep those items in some sort of array that is in a direct line between Earth<->Mars.
If we hand-wave away that problem, the next problem is that each hop is adding latency, so, the direct answer is: no, it makes things slower, and it's a significantly harder problem than just communicating across that distance.
I'm pretty sure Hubble didn't crash due to a memory overflow. It is almost certainly a hardware failure somewhere, and Rust's memory safety won't help you if a failed bus or flaky memory chip is corrupting your data.
I am no space expert but maybe they forgot to disable Android system updates, that's what seems to have caused my Samsung Tab S2 to slow to a crawl ;-)
When dealing with high-latency, high-radiation environments, more modern isn't necessarily better: denser ICs mean greater susceptibility to radiation (and consequently more expensive hardening). They also can't exactly fly up there and swap out random bits on short notice -- I'm not sure if the US even has the current capability to perform physical maintenance on the Hubble.
> Hertz said that because Hubble was designed to be serviced by the space shuttle and the space shuttle fleet has since been retired, there are no future plans to service the outer space observatory.
They actually use the NRO's Quasar relay satellites for this. They don't connect to the "internet," but rather to NRO mission ground stations, but they need that single point of ingress to Earth anyway because the hardware decryption modules, algorithms, and key-loading mechanisms only exist in military comms equipment, not IP routers. From there, provided the data itself is unclassified or can be downgraded, it can get to the Internet.
It's arguably an interesting question whether the government would consider using commercial relay satellites instead of just the Quasar constellation, though. The data stream doesn't need to be decrypted on the satellite, just forwarded. Obviously, you can't prevent radio from being intercepted, so throwing in a hop you don't own doesn't actually add any risk. You're totally reliant on the strength of your encryption either way.
Actually, it wouldn't be. Starlink orbits at about the same altitude, but the satellites have their radio antenna pointed downwards to Earth, so they can't connect with each other.
Last I heard laser links were in testing, and were only being used for communicating within the same orbit (a single, linear string of satellites all orbiting in the same plane and at the same altitude)
I'm not sure I understand this. Humor generally doesn't "contribute to a discussion"—it's purely that, humorous. I'm not sure how OP's comment contributes any less than any other joke one could have made.
I heard from an insider that it was a popup from an unlicensed copy of winrar, I have notified my neighbors and the braintrust of grandchildren, nephews, nieces and a corgi are working out a solution, unfortunately they are having problems getting minecraft, roboblox and fortnight to work with the payload software. The experts who originally wrote the code have retired and are unable to help since they suffered from covid related 5g headaches. The management then outsourced the problem but cannot understand the contractors not because of a language barrier but because on Zoom debugging calls they are required to wear masks even though an ocean separates them. Never fear I'm dual booting Arch (BTW) and Windows 11 and have written a preliminary AI, Blockchain, ML application in Visual Basic and am on the case, it now routes through an Android app on the Amazon app store that can communicate to a Ham tower through a TNC and a Baofeng radio but I am waiting on a part from an overseas shipment. FedEx says it is still in transit on the "Ever given" which was routed through Ireland and was blocked by a creature called the waterhorse, but I gave it tree fiddy.
Programmer/analyst here and ready to help you NASA, just ask me and I'll clear a few hours of my agenda for you.
I know people say this a lot, but in this case I really think a (at least partial) rewrite in Rust of the Hubble software would be very beneficial. We could gather some of the most distinguished coders here in hacker-news and create a task force to show them the benefits of rust's memory safety.
Ok, teach me about https://github.com/rust-lang/rust/issues/82457 then. Unchecked alloca() calls have a pretty bad reputation with non-rust folks. But this is the preferred allocation scheme in Rust. I'm sure our understanding of memory safety is incompatible