Bench Press

The Crossroads of Science and Tech

Archive for the ‘Uncategorized’ Category

A blow to HPC enthusiasts

As Ars Technica reports, Sony has decided to pull the plug on experimentation with non-game-related software on the PlayStation 3. The latest software update prevents users from installing Linux (even on older models that could previously do so), which is the first step in turning the PS3 into a general-purpose computer capable of more than playing games.

We’ve mentioned the PlayStation 3’s enormous computational potential in the past, and so we’re sad to see this capability disappear. It’s easy to understand the need for the changes from Sony’s perspective: Linux installation allowed for a lot more video game piracy, which matters all the more because Sony makes very little money from the PS3 unit itself. The PS3 is most likely a loss leader for Sony’s much more lucrative game licensing business, so Sony decided that supporting the relatively tiny community of researchers and hobbyists just isn’t worth the hit in revenue from video game piracy.

In any case, the newest PS3s don’t support general Linux installation, so the overall impact of this software update will probably be limited. Existing users won’t need to install the update unless they really want to play games online on the PlayStation Network, and newer users won’t be missing much anyway. Still, the older PS3 was a relatively inexpensive way to obtain a test machine with IBM’s Cell processor, since Sony could price its PS3s aggressively thanks to large economies of scale. It’s just too bad that the newest PS3s, and older PS3s with the new software, won’t be able to contribute to massive scientific computational projects for the betterment of mankind.

Written by Eric

March 29th, 2010 at 3:28 pm

Posted in Uncategorized

Making Coding Fun Again

If you’re like me, and you’ve spent endless hours programming in front of your computer, you’d agree that sometimes it’s not the most fun thing to do. While the finished product might be really cool, getting there is oftentimes tedious, frustrating, and hair-pulling. What usually causes these problems is that coders get bogged down in the details because certain blocks of code require unavoidably intricate logic. However, with the new EU-funded research project ReDSeeDS, Michal Smialek and his team of researchers hope to lighten the burden for all coders and make coding enjoyable again.

What Smialek and his team observed is that most coders start a project from a blank screen, so they often end up writing entire programs from the ground up even though other people have probably already written programs that accomplish similar tasks. To avoid this, Smialek aims to create a repository that houses previously written code, stored alongside a list of each program’s aims and requirements. Thus, when a user searches this database with a query of requirements, the database will return previously written code that is expected to produce similar outputs. These functionally equivalent snippets of code are called “artefacts,” and Smialek’s database is essentially a library of artefacts.

In a project, you may produce several artefacts which are design blueprints and then an artefact which is the code that tells the system how to work. The final program is also an artefact which is served by the other artefacts – that is the design and the code.
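
To make the idea of searching by requirements a little more concrete, here is a minimal Python sketch. It is not how ReDSeeDS itself works; the `Artefact` class, the Jaccard overlap score, the `min_score` cutoff, and the repository contents are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artefact:
    """A stored piece of work plus the requirements it was written to satisfy."""
    name: str
    requirements: frozenset


def jaccard(a, b):
    """Overlap between two requirement sets: 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def search(repository, query, min_score=0.25):
    """Return (score, artefact) pairs at or above min_score, best match first."""
    wanted = frozenset(query)
    scored = [(jaccard(wanted, art.requirements), art) for art in repository]
    hits = [pair for pair in scored if pair[0] >= min_score]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


# Hypothetical repository contents, invented for this example.
REPOSITORY = [
    Artefact("user_login", frozenset(
        {"authenticate user", "store password hash", "lock after failures"})),
    Artefact("invoice_pdf", frozenset({"render invoice", "export pdf"})),
    Artefact("password_reset", frozenset(
        {"authenticate user", "send email", "store password hash"})),
]

if __name__ == "__main__":
    query = ["authenticate user", "store password hash"]
    for score, art in search(REPOSITORY, query):
        print(f"{art.name}: {score:.2f}")
```

A real system would need something much smarter than exact matches on requirement strings, but the shape is the same: describe what you need, and get back earlier work ranked by how closely its requirements match.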

While most programs obviously will not completely overlap in terms of requirements, the fact that most programmers will not need to “re-invent the wheel” will greatly enhance productivity and allow them to not get so bogged down with the details. Not only does this allow for a more rapid rate of software releases, it vastly decreases the amount of time needed to fix problems within the code simply because the code within the database will be (we’d like to hope) error-free. Smialek notes that:

What it will do as a commercial product is to reduce considerably the amount of work required to develop a new software application, and that means the ability to develop more and larger systems using the same human resources.

(Image Credit)

Written by Kevin

December 7th, 2009 at 7:00 am

Posted in Uncategorized

Meet the Elements

One of my favorite bands is They Might Be Giants, famous for quirky songs including one of my science-related favorites “Why Does the Sun Shine?” with the amazing line which caught my interest immediately when I first heard it:

“The sun is a mass of incandescent gas, a gigantic nuclear furnace! Where hydrogen is built into helium at a temperature of millions of degrees”

It is with great pleasure that I stumbled on (courtesy of Mr. Gunn’s FriendFeed) one of their latest creations, “Meet the Elements” which I have embedded below for your enjoyment:

Awesome!

If you’d like to follow the Bench Press authors on FriendFeed, you can follow me at http://www.friendfeed.com/benjamintseng, Kevin at http://friendfeed.com/ktseng, Eric at http://friendfeed.com/ericsuh, and Anthony at http://friendfeed.com/atphan.

Written by ben

September 17th, 2009 at 7:00 am

The Life and Death of a News Article

The heartbeat of the news.

Ever since June 25, 2009, Michael Jackson’s death has been the talk of the nation, his face plastered over web articles, newspapers, and television stations. His death broke the record for the number of users on Yahoo News at any one time, topping even President Barack Obama’s inauguration, and Google even believed its servers were under attack due to the sudden spike in web searches for the moon-walking legend. However, have you ever wondered why the news of the King of Pop’s untimely death has stayed in the media for so long, while other news topics, such as the death of another cultural icon, Farrah Fawcett, quickly died out?

Jon Kleinberg, Jure Leskovec, and Lars Backstrom, from the computer science department at Cornell, sought to answer these types of questions by tracking the life cycle of news articles over a three-month period in 2008. Their research covered 20,000 mainstream media sites and over 90 million articles. Using a complex algorithm that could identify certain phrases in different news articles, so that the computer could mark them as being on the same subject (a task that has proven very difficult time and time again), the team tracked the movement of news across blogs and news sites on the Internet. Armed with an extensive pool of data to sift through and analyze, the three researchers discovered an astounding pattern that was shared by most news topics.

They found a consistent rhythm as stories rose into prominence and then fell off over just a few days, with a “heartbeat” pattern of handoffs between blogs and mainstream media. In mainstream media, they found, a story rises to prominence slowly then dies quickly; in the blogosphere, stories rise in popularity very quickly but then stay around longer, as discussion goes back and forth. Eventually though, almost every story is pushed aside by something newer.

Before research like this was done, many editors and journalists perceived something they described to be a “news cycle.” However, with no quantifiable data, there was no way to be confident whether this was just their perceptions or an actual phenomenon. With the information collected by these Cornell researchers, they believe the latter to be the case and have started to create mathematical models which would accurately describe the life-cycle of news.

The slow rise of a new story in the mainstream, the researchers suggest, results from imitation – as more sites carried a story, other sites were more likely to pick it up. But the life of a story is limited, as new stories quickly push out the old. A mathematical model based on the interaction of imitation and recency predicted the pattern fairly well, the researchers said, while predictions based on either imitation or recency alone couldn’t come close.
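
For the curious, here is a toy Python simulation of how imitation and recency can interact. It is only a sketch under my own assumptions, not the researchers’ actual model: the `(n + 1)` imitation weight, the exponential decay with time constant `tau`, and all of the default parameter values are choices I made for illustration.

```python
import math
import random


def simulate(steps=60, sources=50, tau=3.0, seed=0):
    """Return volume[j][t]: stories published about thread j at time step t."""
    rng = random.Random(seed)
    volume = []   # one list of per-step story counts for each thread
    totals = []   # running story count for each thread
    for t in range(steps):
        # A new thread (news topic) is born at every step; thread t is born at time t.
        volume.append([0] * steps)
        totals.append(0)
        # Imitation: weight grows with the stories already written (totals + 1).
        # Recency: weight decays exponentially with the thread's age (t - j).
        weights = [(totals[j] + 1) * math.exp(-(t - j) / tau)
                   for j in range(t + 1)]
        # Every source picks one thread to write about at this step.
        for j in rng.choices(range(t + 1), weights=weights, k=sources):
            volume[j][t] += 1
            totals[j] += 1
    return volume


if __name__ == "__main__":
    volume = simulate()
    biggest = max(volume, key=sum)   # the thread with the most coverage overall
    print(biggest)                   # its story count at each time step
```

Setting the imitation term to a constant, or making `tau` very large, is an easy way to play with what each ingredient contributes on its own.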

This type of news excites me because it shows how technology and the Internet have produced a tangible result (in this case, a mathematical model of the life cycle of a news article) for a question that would have been unanswerable 20 years ago. Truly the capabilities of technology to solve even the most abstract problems are limitless.

(Image Credit)

Written by Kevin

July 16th, 2009 at 6:00 am

Are you positive it’s positive?

As genomes have been sequenced over the past few decades scientists have looked for new ways to analyze and interpret the wealth of information. They’ve developed numerous algorithms with goals ranging from organizing evolutionary family trees (inspired by plagiarism detecting software) to aligning genetic sequences. All of this to answer the numerous questions that can now be asked thanks to sequence databases. One of the many things scientists have attempted to study is positive selection in protein-coding genes.

Positive selection of advantageous gene mutations is particularly interesting to scientists as it can provide insight into the function of new genes. However, positive selection is difficult to detect and analyze because neutral and deleterious mutations far outnumber advantageous ones. Initially, scientists looked for positive selection by simply comparing the ratio (/omega) of the number of nonsynonymous nucleotide substitutions (dN) to the number of synonymous nucleotide substitutions (dS) between homologous protein-coding gene sequences, using Fisher exact tests to accept or reject a null hypothesis of neutral selection (1).
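
As a rough illustration of that early approach, the Python sketch below computes /omega from substitution counts and runs a one-sided Fisher exact test. The 2x2 table layout (changed versus unchanged sites, for nonsynonymous versus synonymous positions), the use of whole-number site counts, and the example numbers are all my own simplifying assumptions, not values from either paper.

```python
from math import comb


def omega(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Ratio of per-site nonsynonymous to per-site synonymous substitutions."""
    dn = nonsyn_subs / nonsyn_sites
    ds = syn_subs / syn_sites
    return dn / ds


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    With all row and column totals held fixed, this is the probability of
    seeing a top-left cell at least as large as a.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)


if __name__ == "__main__":
    # Made-up counts for a single pair of homologous sequences.
    nonsyn_subs, nonsyn_sites = 12, 150
    syn_subs, syn_sites = 2, 50

    w = omega(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites)
    p = fisher_exact_greater(
        nonsyn_subs, nonsyn_sites - nonsyn_subs,   # nonsynonymous: changed, unchanged
        syn_subs, syn_sites - syn_subs,            # synonymous: changed, unchanged
    )
    print(f"omega = {w:.2f}, one-sided p = {p:.3f}")
```

An /omega above 1 with a small p-value would be read as a hint of positive selection under this scheme; an /omega near 1 is what neutrality predicts.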

Over the years scientists developed additional statistical analyses to infer positive selection. Two of the most popular methods are the branch-site method (BSM) and the site-specific method. The BSM utilizes a likelihood ratio test to detect positive selection within a given phylogenetic branch. The site-specific method, on the other hand, utilizes /omega to look for specific amino acid substitutions that are positively selected. Both of these methods have been utilized in hundreds of papers and seemingly provided a great deal of insight into potential points of positive selection within various genomes. What would you say, then, when told that both of these methods contain significant flaws that produce an inordinate number of false positives?

Bovine Rhodopsin protein with predicted sites in red and experimentally determined in blue. (Adapted from Yokoyama et al. 2008 PNAS)

That’s exactly what Masatoshi Nei and his group believe they have shown in a recent paper evaluating the reliability of the branch-site and site-specific methods. Using several controlled computer simulations, as well as data collected by Shozo Yokoyama at Emory University on dim-light vision opsins in vertebrates (2), Nei’s group determined that both the branch-site and site-specific methods yielded far too many false positives. Nei and his group contend:

This low rate of predictability occurs because most of the current statistical methods are designed to identify codon sites with high /omega values, which may not have anything to do with functional changes. The codon sites showing functional changes generally do not show a high /omega value. To understand adaptive evolution, some form of experimental confirmation is necessary.

From this paper it looks like scientists looking for high /omega values may have been chasing ghosts by assuming that amino acid changes result in functional changes indicating proof of positive selection. The potential impact this will have on hundreds of papers is stunning. In the end the take home message is that statistical analyses, no matter how elegant, have their limits and ought to be utilized in conjunction with experimental data as much as possible.

(Sources: 1 – Reliabilities of identifying positive selection by the branch-site and the site-prediction methods, 2 – Elucidation of phenotypic adaptations: Molecular analyses of dim-light vision proteins in vertebrates)

updated: Had to change all the &omega to /omega because WordPress kept changing it into ? for some reason…bah

Written by Anthony

April 21st, 2009 at 12:37 am

To Stimulate Open Science

A lot of scientific circles are talking about how best to spur collaboration, and that’s spawned a number of movements, such as “open access” and “open science” — both inspired by the “open source” movement in programming — that fight to end the fencing of science into proprietary, commercial enclaves that require fees to access. Clearly, in terms of fostering the trade of knowledge, an open, free highway is better than a highway with a large toll.

Although much of this movement towards open science has focused on journals and their large subscription fees, there’s another area of open science that’s drawn my attention: Gene Ontology (GO) annotations, which are a set of standardized annotations that classify genes according to their biological function, such as “amino acid metabolism.” These annotations are, as of now, curated by experts. What I’ve noticed in particular is that GO has thrived in one community and withered in another, and I’m curious as to why.

The yeast community is famous amongst all the molecular biology communities as being open and collaborative, to the extent that almost all gene names have been systematized, annotations for genes are very extensive and well-structured, a strain is available for the deletion of every gene, many genes are available fused to a fluorescent marker for easy microscopy, and so on. Just go to the Saccharomyces Genome Database, and there’s a wealth of all this sort of information at your fingertips, centralized, standardized, interconnected, and easy to use. In particular, the Gene Ontology annotations are considered superb and accurate, allowing for easy computational interpretation of large-scale experiments involving hundreds and thousands of genes and their interactions. Yeast genomicists use GO all the time, and contribute to its development very often.

In contrast, the human Gene Ontology annotations are considered sparse and relatively uninformative, and generally they aren’t quite as useful for interpreting things like gene expression microarrays. Instead, one of the most successful and popular sets of biological function annotations is called Ingenuity, which is a commercial software package, well developed by the large amount of money poured into it by pharmaceutical companies and other health science research and development.

Why did the two communities end up going in two directions, one towards a more collaborative, “open science”-friendly annotation system, and the other towards a proprietary, commercial annotation platform? Undoubtedly, part of the reason is the structure of financial incentives; human biology has unique opportunities for direct commercialization via drug or health research, and so people would naturally focus their efforts on things that can win them fortune. But the first yeast biology research done by Louis Pasteur was probably related to budding (pun intended) commercial R&D on reproducible bread/wine/beer recipes, so what prevented the yeast community from, say, balkanizing yeast research because of incentives from the beer brewing and bread-making industries?

Perhaps it is because the yeast community arrived at common standards and nomenclature for information sharing long before it got very large. After all, yeast doesn’t have nearly the same problem with multiple names for the same gene that humans do (just look at the gene RANKL, which is also known as OPGL, ODF, CD254, TNFSF11, TRANCE, and hRANKL2). Yeast researchers also don’t have nearly as much of a problem with the explosion of gene database IDs (humans have, as a small sample: RefSeq, HGNC, Ensembl, EMBL/GenBank, Entrez, MIM, Unigene, UniProt/SwissProt, and UCSC). Perhaps having a common, universal standards-making institution is the answer, to make sure all the railroad tracks are the same width, to use an analogy.

Or perhaps it’s the size of the community. There are many, many more labs studying human biology than yeast biology, not only because of the financial incentives, but also because of the huge size of the human genome (roughly 250 times bigger than the yeast genome). Maybe it’s just easier to coordinate fewer people into one community.

I think as the scientific community moves forward, especially in embracing new collaborative methods on the internet, we should closely examine what’s worked so far and what hasn’t, so that we don’t end up fording through endless patents, fees, and proprietary, non-interoperable data structures to get what we need.

Targeted Drug Delivery

Today modern medicine provides patients with numerous drugs for an enormous number of health issues. For example, getting relief from a headache can be as simple as popping open a bottle of aspirin and swallowing a couple of pills. While to the patient the delivery of the drug begins and ends with swallowing those two pills with a glass of water, to the scientists working on the drug that’s simply the beginning of numerous steps that hopefully result in the drug surviving the trip through the body to its intended target and doing its job.

Drugs are therefore designed not just to solve a problem but to survive the human body’s natural mechanisms. The gauntlet of obstacles that a drug faces upon entry into the body is a major reason why many researchers continue to look into innovative techniques for delivering pharmaceuticals.

That’s where research being conducted by Drs. Stefan Franzen and Steve Lommel comes in. Working with the red clover necrotic mosaic virus (RCNMV), Drs. Franzen and Lommel have developed a potentially revolutionary drug delivery platform.

Figure 1. Production of Drug Vector

Drs. Franzen and Lommel take advantage of a 17 nanometer space within the 38 nanometer icosahedral capsid of RCNMV in order to store therapeutics. The RCNMV infused with the drugs could then be used to deliver the drugs in a cell specific manner with the addition of targeting peptides.

The preparation of the drug-carrying virus is elegant in its simplicity and produces a robust delivery mechanism (see Fig. 1). First, RCNMV is treated with EDTA to open pores in the capsid. Next, therapeutics are infused through these open pores. The pores are then sealed with Ca²⁺, which is key to releasing the drug later, once the virus has entered the cell. The prepared virus can then be purified via dialysis, after which target-specific peptides are added.

The elegance of using Ca²⁺ to seal the pores lies in the fact that the human bloodstream is rich in calcium, which keeps the pores closed in transit. Inside cells, calcium levels are much lower, allowing the pores to open up and release the infused therapeutics only once the target cell has been entered.

In vitro work with RCNMV infused with doxorubicin, a cancer drug, shows promising results (see Fig. 2): apoptosis is promoted only when the virus carries targeting peptides, which allow the drug to be delivered to the interior of cells.

Figure 2. Delivery of Doxorubicin RCNMV to HeLa cells

A potential application of this research is in cancer treatment. Current chemotherapy treatments often result in dramatic side effects because the drugs do not distinguish between diseased and healthy cells. While these results are probably still years from yielding a commercial therapy, they provide hope that in the near future doctors will be able to prescribe chemotherapy treatments with dramatically reduced side effects thanks to target-specific delivery of the drugs.

(Sources: NCSU – results, NCSU News, Franzen Presentation – Plant Virus Nanotechnology)

Written by Anthony

February 17th, 2009 at 7:55 pm

Seam carving will be in Photoshop CS4!

I always find it fascinating how fast technology moves; in this case, “seam carving”, which is a clever algorithm for removing or replicating low-content spaces in images, is now going to be in Photoshop CS4 as “content aware scaling”! Pretty awesome for an algorithm that only got presented last year.
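
If you haven’t seen how seam carving works, here is a bare-bones Python sketch of the shrinking half of the idea (it does not cover replicating seams to enlarge an image). It is my own simplified version, not Photoshop’s implementation: it assumes a grayscale image stored as a list of rows, uses a simple gradient as the “content” measure, and the function names are mine.

```python
def energy(img):
    """Per-pixel energy: sum of absolute horizontal and vertical differences."""
    h, w = len(img), len(img[0])
    e = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left, right = img[y][max(x - 1, 0)], img[y][min(x + 1, w - 1)]
            up, down = img[max(y - 1, 0)][x], img[min(y + 1, h - 1)][x]
            e[y][x] = abs(right - left) + abs(down - up)
    return e


def find_vertical_seam(e):
    """Return one column index per row tracing the lowest-energy connected path."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    # Dynamic programming: cheapest way to reach each pixel from the top row,
    # moving straight down or one step diagonally.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Drop one pixel from each row, at the column the seam passes through."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def carve(img, n):
    """Narrow the image by n columns, recomputing the energy after each seam."""
    for _ in range(n):
        img = remove_vertical_seam(img, find_vertical_seam(energy(img)))
    return img


if __name__ == "__main__":
    # A flat background with a bright two-pixel-wide stripe down the middle.
    image = [[0, 0, 9, 9, 0, 0] for _ in range(4)]
    for row in carve(image, 2):
        print(row)
```

The seams snake through the flat background and leave the stripe alone, which is exactly why the resized image keeps its “content” instead of getting squished uniformly.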

Written by Eric

October 11th, 2008 at 9:49 pm

Posted in Uncategorized

In Search of the Darwin particle!

I’ve been told biologists are just haters, but who needs to know why things have mass when we can find the Darwin particle!

Written by Anthony

September 11th, 2008 at 2:30 pm

Posted in Uncategorized
