In the third and concluding part of the three-part series on the subject, Ashoka gives us unique insight into scientific revolutions, in the weekly column, exclusively in Different Truths.
About 18 years ago, Tim Hammond, a physician and research scientist at Tulane, flew samples of kidney tissue in the space shuttle. When his samples, which had been chemically frozen while in microgravity, returned to Earth, Hammond and his collaborators measured the expression of thousands of genes within the tissue. They found that a large proportion were expressed very differently from the genes within the Earth-bound samples of the same tissue, no matter how the Earth-bound tissue had been mechanically treated. Acceleration, low gravity, or something else as yet unknown about the shuttle environment affected gene behaviour. If, as seems likeliest, the effect Hammond discovered is an essentially mechanical effect of low gravity, it has important implications for long-term habitation in space, on the Moon and on Mars.
Mechanical issues – flow and shear over cellular surfaces – are known to influence genes that are important to human health. The cells that line the surfaces of blood vessels play crucial roles in lethal disorders – for example, in aneurysms – and particular genes in these cells have been known for some time to change their expression in response to mechanical changes, in particular to changes in liquid flow across their surfaces. David Peters, a young biologist at the University of Pittsburgh, and his colleagues measured the change in gene expression in response to changes of flow for almost all genes in living human cells lining blood vessels. In their experiment, more than a hundred genes changed, including some known to be involved in cellular structure. Peters and his collaborators are now measuring all of the genes in such cells that respond to changes in pressure and flow.
The few cases I have briefly described here are merely samples of a trend that can be seen in several sciences – a trend to which we can also attribute the Virtual Observatory that is planned to enable astronomers to search and analyse vast data stores taken by remote instruments; and, in climate studies, the Earth-observation satellites that now send down several gigabytes of data each day – data that is increasingly being used to monitor the state of the planet, to locate causes of change, and to forecast changes in the environment. Ever newer techniques make possible the measurement of ever larger quantities of data; data-manipulation software makes possible the selection of samples that are relevant to particular problems; automated search and statistical techniques help guide researchers through the super-astronomical array of possible hypotheses.
Kuhn said that scientific revolutions generally meet fierce resistance – and the automation of discovery in science is no exception. In some cases, the animosity stems from nothing more than conservatism, an effort to preserve academic turf, or plain old snobbery. Above all, automated science competes with a grand craft tradition that assumes that science progresses only by scientists advancing a single hypothesis, or a small set of alternative hypotheses, and then devising a variety of experiments to test it. This tradition, most famously articulated by Sir Karl Popper, is championed by many historians and philosophers of science and resonates with the accounts of science that many senior scientists learned in graduate school.
While the history of science can serve as an argument for norms of practice, for several reasons it is not a very good argument. The historical success of researchers working without computers, search algorithms, and modern measurement techniques has no rational bearing at all on whether such methods are optimal, or even feasible, for researchers working today. It certainly says nothing about the rationality of alternative methods of inquiry. Neither was nor is implies ought.
The ‘Popperian’ method of trial and error dominated science from the sixteenth through the twentieth century not because the method was ideal, but because of human limitations, including limitations in our ability to compute. Historically, novel methods and stratagems were devised from time to time to get round computational limitations. For example, in the eighteenth century, Leonhard Euler, perhaps the most prolific mathematician ever, could not reconcile seventy-five observations because the calculations required far too many steps; statistical estimation of theoretical parameters, introduced by Legendre, in 1805, in a form known as ‘least squares,’ permitted the reconciliation of (for the time) large quantities of data, such as the seventy-five observations that defeated Euler. The quick adoption of factor analysis in the 1940s was due in part to computational tractability, and one could argue that the same is true of the enormous influence of Sir Ronald Fisher’s statistical methods.
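Least squares, in the spirit of what Legendre introduced, reconciles many inconsistent observations by choosing the parameters that minimise the sum of squared residuals. The sketch below is illustrative only – the data are invented stand-ins for astronomical observations, not Euler's actual figures – and fits a straight line through seventy-five noisy points via the normal equations.

```python
import numpy as np

rng = np.random.default_rng(42)

# 75 noisy observations of a quantity that varies linearly with time,
# echoing the seventy-five observations that defeated Euler.
t = np.linspace(0.0, 10.0, 75)
y = 3.0 + 0.5 * t + rng.normal(0.0, 0.2, size=t.shape)

# Least squares: minimise sum((y - (a + b*t))**2) by solving the
# normal equations (X^T X) beta = X^T y, where X has a column of
# ones (intercept) and a column of t (slope).
X = np.column_stack([np.ones_like(t), t])
beta = np.linalg.solve(X.T @ X, X.T @ y)
a, b = beta
print(f"intercept = {a:.2f}, slope = {b:.2f}")  # close to the true 3.0 and 0.5
```

No single observation is trusted; the estimate is a compromise across all seventy-five at once, which is exactly what made the method computationally attractive before modern machines.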
When scientists seek to learn new, interesting truths, to find important patterns hiding in vast arrays of data, they are often trying to do something like finding a needle in an enormous haystack: a truth among many falsehoods, a correct network among many possible networks, a robust pattern among many apparent but unreal patterns.
So how does one find a needle in a haystack?
1. Pick something out of the haystack. Subject it to a severe test, e.g., see if it has a hole in one end. If so, conjecture it’s a needle; otherwise, pick something else out of the haystack and try again. Continue until you find the needle or until civilisation comes to an end.
2. Pick something you like out of the haystack. Subject it to a test. If it doesn’t pass the test, find a weaker test (e.g., is the thing long and narrow?) that it can pass.
3. Try 1 for a while, and if no needle turns up, forget about needles and start studying hay.
4. Try 1 for a while, and if no needle turns up, change the meaning of ‘needle’ so that a lot of ‘needles’ turn up in the haystack.
5. Set the haystack on fire and blow away the ashes to find the needle.
6. Run a magnet through the haystack.
Method 1 is still the standard description of how science is and should be conducted – the account we find explicitly in the introductory chapters of science textbooks and implicitly in the criticisms some scientists and methodologists express toward other ways of doing things.
Method 2 is practised and in effect advocated by many social scientists (you need only replace ‘something you like’ in 2 with ‘theory’).
Methods 3 and 4 are the practices that postmodernists claim science does and should follow.
Methods 5 and 6 are those made possible by the automation of discovery.
In principle, methods 5 and 6 are a lot smarter than the other methods, but they are not without limitations, both real and metaphorical. Burn the whole haystack and you might melt the needle. And that is a sound worry about automating science: it may rush things, sometimes too much. That a procedure for finding hypotheses is fast and can be done by computer does not mean the procedure gives good results. Figuring out what a method can and cannot reliably do requires hard work.
Consider, for example, the problem of identifying networks of gene regulation. The ability to measure gene expression simultaneously for thousands of genes in normal and perturbed genomes (in perturbed genomes, particular genes have either been deleted or forced to over-express) invited the application of computer methods that search for causal networks. Algorithms were proposed for piecing together networks from comparisons of gene expression measurements in cell lines with perturbed and unperturbed genomes; algorithms were proposed for finding networks from correlations among repeated measurements of expression levels in unperturbed genomes – and they did very well on data produced by computer simulations of gene expression.
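The correlation-based idea can be sketched in a few lines. This is a toy illustration, not any published algorithm: the gene names, the simulated data, and the cutoff of 0.5 are all invented. An edge is proposed between two genes whenever their expression levels correlate strongly across repeated measurements.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated expression matrix: 200 repeated measurements of 4 genes.
# geneB is driven by geneA; geneC and geneD are independent noise.
n = 200
geneA = rng.normal(0.0, 1.0, n)
geneB = 0.8 * geneA + rng.normal(0.0, 0.5, n)
geneC = rng.normal(0.0, 1.0, n)
geneD = rng.normal(0.0, 1.0, n)
X = np.column_stack([geneA, geneB, geneC, geneD])
names = ["geneA", "geneB", "geneC", "geneD"]

# Correlation thresholding: propose an edge wherever |r| exceeds a cutoff.
r = np.corrcoef(X, rowvar=False)
threshold = 0.5
edges = [(names[i], names[j])
         for i in range(len(names))
         for j in range(i + 1, len(names))
         if abs(r[i, j]) > threshold]
print(edges)  # only the geneA-geneB link should survive the cutoff
```

On clean simulated data like this, the method looks impressive – which is precisely why, as the next paragraph explains, its failures on real measurements were easy to miss.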
Much of this work, however, proved to be illusory. The algorithm for assembling a network from perturbation effects was incorrect. The algorithms for inferring networks from correlations of gene expressions overlooked the fact that measuring expression levels in aggregates of cells (rather than in individual cells, which is technically feasible but rarely done) creates correlations due entirely to the aggregation itself rather than to the influence of particular genes on the expression levels of others. The simulations that seemed to work so well also turned out to be simulations of measurements at the level of individual cells – measurements of a kind usually not made in reality. Undoubtedly the automated procedures got some things right, but very likely what they got right was cherry-picking – gene connections indicated by very large changes in expression levels or very large correlations. A real advance in unravelling gene regulation networks came recently – by chemical rather than by computer automation. Tong Ihn Lee and his colleagues found a way to identify a large fraction of the genes in yeast that are directly regulated by genes known to be regulators. They did so for more than a hundred regulator genes, effectively identifying a good piece of the regulatory structure in ‘wild type’ yeast.
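The aggregation artefact is easy to reproduce. In this toy simulation (the two cell types, their expression means, and the pool sizes are all invented for illustration), two genes are statistically independent within each cell type, yet their pooled, aggregate measurements correlate strongly – simply because each pool mixes the two cell types in a different proportion.

```python
import numpy as np

rng = np.random.default_rng(1)

def single_cells(n, mean):
    # Within a cell type, the two genes vary independently around a
    # shared baseline: no gene regulates the other.
    return rng.normal(mean, 1.0, size=(n, 2))

# Two cell types with different baseline expression for both genes.
type_hi = single_cells(500, 10.0)

# Within a single cell type, the genes are (nearly) uncorrelated.
within_r = np.corrcoef(type_hi[:, 0], type_hi[:, 1])[0, 1]

# Aggregate samples: each pools 100 cells, mixing the two types
# in a random proportion, and reports only the pool average.
pools = []
for _ in range(200):
    p = rng.uniform()            # fraction of high-expressing cells
    n_hi = int(p * 100)
    cells = np.vstack([single_cells(n_hi, 10.0),
                       single_cells(100 - n_hi, 2.0)])
    pools.append(cells.mean(axis=0))
pools = np.array(pools)
aggregate_r = np.corrcoef(pools[:, 0], pools[:, 1])[0, 1]

print(f"within-type correlation: {within_r:+.2f}")   # near zero
print(f"aggregate correlation:   {aggregate_r:+.2f}")  # near one
```

The strong aggregate correlation says nothing about one gene influencing the other; it reflects only the varying mixture of cell types from pool to pool – exactly the trap that the correlation-based network algorithms fell into.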
The automation of learning, whether by computer or by new laboratory techniques, does not render human judgment obsolete or marginalise scientific creativity. Nor does it cheapen the sweat and effort, the insight and ingenuity, of human scientists; rather, it shifts them toward the design of algorithms that can efficiently and reliably compare many hypotheses with vast quantities of data, and toward laboratory methods that answer many questions at once.
©Ashoka Jahnavi Prasad
#ScientificRevolutions #Mechanicalissues #Aneurysm #Algorithms #GeneRegulation #MidweekMusing #DifferentTruths