Why skin lesions are peanuts and brain tumors harder nuts (thegradient.pub)
70 points by atg_abhishek on Nov 1, 2020 | hide | past | favorite | 8 comments


I'm an orthopedic surgeon with over two decades of coding experience, and for the last six years I've been working on developing deep learning models for musculoskeletal images. Despite being in a rather narrow field, focusing only on trauma radiographs, the complexity of the task is hard to overstate. We have piloted our software with our radiologists at the hospital, but the number of features you need before clinicians actually feel there is any benefit is huge. At the moment we're training 480+ different labels, and I still think we're not even halfway there for trauma radiographs.
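For readers unfamiliar with the setup: training hundreds of labels against one radiograph usually means a multi-label head with an independent sigmoid per finding, since several findings can co-occur on the same image. Here's a minimal numpy sketch of that idea; the feature size, weights, and label count are purely illustrative, not the commenter's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

N_LABELS = 480    # hypothetical label count, per the comment above
N_FEATURES = 512  # hypothetical backbone feature size

# Stand-in for CNN backbone features of one trauma radiograph.
features = rng.standard_normal(N_FEATURES)

# Multi-label head: one independent sigmoid per label, because a single
# radiograph can show several findings at once (unlike softmax classes,
# which are mutually exclusive).
W = 0.01 * rng.standard_normal((N_LABELS, N_FEATURES))
b = np.zeros(N_LABELS)

logits = W @ features + b
probs = 1.0 / (1.0 + np.exp(-logits))   # per-label probability of presence
findings = np.flatnonzero(probs > 0.5)  # labels predicted present

print(probs.shape)  # (480,)
```

In a real pipeline each of the 480 sigmoids would be trained with binary cross-entropy against its own annotation, which is one reason the annotation burden grows so quickly with the label count.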


Which datasets are you using? I've only used Stanford's MURA so far.


We have our own dataset that we're annotating.


Very glad to know that people with experience are doing things like this.

The current state of medical care with regards to musculoskeletal issues is fucking abysmal, at least within the US...


There is, though, a lot of good research coming out of the US, but from my understanding the procedures can be rather expensive. Sweden probably has some of the best hip arthroplasty outcomes, yet the cost of these is roughly 5-10 times lower than in the States. I think the system is somewhat stacked against you: if doctors (also human beings) spend 80-100 hours/week during their residency, they expect a financial reward afterwards, or no one would agree to that.

Orthopedics is also somewhat of an odd medical specialty, with lots of incentive issues and a scarcity of good clinical trials. A study from Sydney found that among the 50% of surgeries that had been evaluated with randomized clinical trials (RCTs), the trials actually supported the surgery in only 50% of cases (i.e., only about a quarter of surgeries had RCT backing). It is, though, much more difficult to randomize people to surgery; giving the blue or red pill is much easier.

I'm also pretty sure that a big part of the problem is us doctors failing to identify whether a patient is part of the long tail, i.e. whether he/she falls within a particular study's inclusion criteria. Hopefully we will be able to shortcut this problem with deep learning tools, but as most self-driving-car enthusiasts know, there is a big difference between driving around a parking lot and driving on open streets.


Geoffrey Hinton even went as far as to state that we should stop training radiologists immediately, because deep neural networks will likely outperform humans in all areas five or ten years from now.

Diagnostics have become very valuable, but at the expense of good old-fashioned medical wisdom.

Doctors used to be some of the smartest, wisest and most educated people in town. Most settlements were fairly small, so people generally tended to know a lot about other people around them and there was less variability of lifestyle.

Doctors would visit you at home with their little black bag. Dr. McCoy of the original Star Trek was sort of conceived on that model and from there we get the futuristic "black bag": his tricorder. He would aim it at you anywhere he happened to be and diagnose you.

Later Star Trek series were more prone to putting a person in a machine, much like modern diagnostic machines, and relied less heavily on the tricorder.

With the rise of diagnostic machines you see the decline of home visits, because the patient now needs to go where the tech is. A side effect: the doctor no longer casually observes details about your home life and lifestyle without having to specifically ask, and we don't seem to think this is significant.

We now act as if the human body is a specimen in a petri dish, separate from the patient's life. This comes with some inherent problems.

The diagnostics are amazing and wonderful, but we really need to find ways to counteract some of the downside involved in how this is shaping medicine. No, we do not need to just remove humans entirely and let machines diagnose us, please.

Potentially also of interest (though it's from five years ago):

Suddenly, a leopard print sofa appears

https://news.ycombinator.com/item?id=9749660


Skin lesions are still not peanuts. There are zero FDA-approved machine-learning diagnostics or other management technologies in dermatology. There is no widespread clinical use of CNNs for skin lesion diagnosis, even at top-tier university hospitals. With more reliable studies, hopefully this will change. But in terms of eating the nuts after having proverbially cracked the shell with a couple of somewhat promising papers, we are 0% of the way there in the clinic, and there seems to be good reason for that.

The cited 2017 Stanford study was a landmark in principle, but (1) it was not making an exceedingly relevant comparison, and (2) it would need out-of-sample validation on other image sources before people could use it. Regarding (1), ideally one would compare CNN evaluation to a real-world clinical evaluation (assessment of texture and an extremely 'hi-res' magnified exam, +/- polarized light), rather than to dermatologists also looking at a grainy 299x299-pixel image. But the preferred study would be difficult or impossible to coordinate, so at least regarding (2), high out-of-sample performance would need to be proven, such as on a sample from an entirely different institution or group of institutions, and that step has been a great stumbling block for many trained networks in medicine. I don't know what the status of their network is.
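To make the out-of-sample point concrete, here's a toy numpy sketch (all numbers synthetic, nothing to do with the actual Stanford model): the same AUC metric computed on a home-institution sample versus a hypothetical external one with a different case mix, where the apparent performance drops.

```python
import numpy as np

def auc(y_true, scores):
    # ROC AUC via the Mann-Whitney U statistic: the probability that a
    # randomly chosen positive case scores higher than a negative one.
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    return float((pos[:, None] > neg[None, :]).mean())

# Synthetic scores: clean separation on the home-institution test set...
y_internal = np.array([0, 0, 0, 1, 1, 1])
s_internal = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])

# ...but overlap on an external sample (different scanners, case mix).
y_external = np.array([0, 0, 0, 1, 1, 1])
s_external = np.array([0.2, 0.6, 0.8, 0.5, 0.7, 0.9])

print(auc(y_internal, s_internal))  # 1.0: perfect separation at home
print(auc(y_external, s_external))  # 0.666...: performance drops externally
```

The numbers are contrived, but the drop is exactly the pattern that external validation studies keep finding in medical imaging models.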

The other curious issue is representation. Even if the model performs well on 100,000 images from mostly-Caucasian patients, would I trust it if my skin were brown or black, or had background dyspigmentation from a chronic skin disorder? Eek, not sure. A CNN trained on patients from Japan should not be expected to work for skin lesions from randomly chosen people in South Africa or Sweden. This is a most obvious case for needing actively inclusive representation in neural network training. One way or another, a massive number of heterogeneously acquired images is needed from a huge range of people. For underrepresented skin types, or for lesions in underrepresented contexts (overlapping with a scar, in a tattoo, or on scrotal or perianal skin), many clinicians and astute patients would be happier with a Bayesian network that gives a prediction confidence interval, rather than having to deal with whatever pops out of the softmax.
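A rough numpy sketch of that softmax-vs-uncertainty distinction, with purely made-up logits and illustrative class names: a plain softmax yields a single point estimate, whereas sampling perturbed predictions (a crude stand-in for MC dropout or a deep ensemble) yields a spread from which a per-class interval can be read off.

```python
import numpy as np

rng = np.random.default_rng(42)

def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()

# Made-up logits for one lesion over three illustrative classes
# (e.g. benign / dysplastic / melanoma -- labels are hypothetical).
logits = np.array([2.0, 1.5, 1.8])

# What "pops out of the softmax": one point estimate, no uncertainty.
point = softmax(logits)

# Stand-in for MC dropout / a deep ensemble: perturb the logits T times
# and look at the spread of the resulting probability vectors.
T = 1000
samples = np.stack([softmax(logits + rng.normal(0.0, 0.5, size=3))
                    for _ in range(T)])
mean = samples.mean(axis=0)
lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)  # ~95% interval per class
```

A wide `[lo, hi]` interval on the predicted class is the kind of signal a clinician could use to fall back on biopsy, which a bare softmax vector never provides.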

The fields where there is more success at least in terms of getting technologies approved for use (but also probably in terms of more active research) are here, and many relate to radiology: https://www.nature.com/articles/s41746-020-00324-0/tables/2


It's a very different world, but I relate to this a ton coming from biomedical EM. This article does an awesome job explaining what makes biomedical computer vision so hard.

There's huge opportunity for automation, a wide range of vision challenges, and very cool rewards if the challenges can be met. We're now working on building 3D nanoscale structural models of COVID vs. control blood clots, and modern hardware leaves tons of data on the table for better software to take advantage of. The relevant vision problems are very challenging, and even just within EM, robust computer vision that isn't built around individual narrow high-visibility problem domains (like neural circuit tracing) has a long way to go.



