Bench Press

The Crossroads of Science and Tech

Archive for the ‘Supercomputer’ tag

This…Is…Jeopardy!

Introducing today’s contestants…the IBM QA system Watson? That’s right, folks. Our friends at IBM, not content with simply creating a supercomputer capable of defeating humans at Go, have taken it a step further and are currently building a supercomputer (codenamed Watson) designed to beat humans at a game of Jeopardy! [IBM Video on Watson at the bottom]

The interesting thing about this particular problem is that, unlike Go and chess, which have clearly defined rules and discrete moves and outcomes, playing a game of Jeopardy requires an understanding of semantics, which has traditionally been the domain of humans.

Admittedly, there are natural language processing solutions out there. But at the end of the day, we’re a long way off from the computers displayed in Star Trek which can:

  • Understand spoken words – This is a very challenging problem. How do you instruct a computer to comprehend not just words, but the actual meaning of those words (semantics)? Understanding that the “can” in “I can do it” is very different from the “can” in “soda can”, that “being on pins and needles” is just an expression, or even whether a sentence is a question or a statement are all very deep problems. But this is only the beginning of Watson’s challenges, for Watson must also be able to…
  • Search a massive database for relevant information – Merely searching a database for a list of possible results is a tractable problem that many database/search engines have already solved (e.g. searching for “Indian economy” on Google’s search engine). Searching a large database to find a particular answer behind the reams of data is much harder (e.g. understanding that “Economic Output” can be measured by a country’s GDP).
  • Understand the relevant information – Just as it’s harder to understand Quantum Theory than it is to merely read the papers, IBM’s Watson must be able to parse the information that it has found in its database. For instance, if asked to compare India’s economic output to its neighbors, a computer must not only understand that economic output is GDP, it must also understand what “neighbors” means in the context of India, understand that GDP may be “real” or “nominal” and may need to be adjusted for currency, and understand what it means to “compare” GDPs.
  • Formulate a response – This is related to the first task, but is more challenging. Just as it’s harder to memorize the Bible than it is to recognize specific passages, IBM’s Watson must do more than just recognize/understand words – it must be able to create its own sentences which use the relevant information and the understanding it has developed.

The task is challenging, but not impossible. Already, researchers have demonstrated computers which can carry out the scientific method (hypothesize –> experiment/test –> analyze –> formulate new hypotheses) all on their own. Granted, the scientific problem explored was more systematic in nature (and had a more well-defined solution set than a game of Jeopardy), as it was focused on finding missing pieces in metabolic networks. But the fact that a computer was capable of performing basic high-level logic is very promising (although threatening to lab techs and uncreative grad students everywhere) for fields of research which were formerly intractable due to their scope (e.g. mapping out the human proteome or transcriptome).

“The essence of making decisions is recognizing patterns in vast amounts of data, sorting through choices and options, and responding quickly and accurately,” said Samuel J. Palmisano, Chairman, President and Chief Executive Officer. “Watson is a compelling example of how the planet—companies, industries, cities—is becoming smarter. With advanced and deep analytics, we can infuse business and societal systems with intelligence. This project is the latest example of IBM’s longstanding commitment to fundamental research and to overcoming ‘grand challenges’ in science and technology.”

It’ll be interesting to see if IBM’s Watson does as well as IBM promises. Although I don’t know how well Watson would fare against Ken Jennings, Watson’s completion would be a landmark in artificial intelligence and computer science. Watson may pave the way to an age where computers can actively help doctors diagnose patients or help business executives make financial decisions (which is probably what IBM is going for here).

(Image Credit) (Video)

Written by Kevin

May 19th, 2009 at 10:05 am

Go Computers Go

Many problems in science require computing power which goes beyond mere number crunching and extends into the realm of “artificial intelligence.” For years, what researchers considered to be the ultimate test of artificial intelligence was the ability to defeat a chess pro (something which was formally resolved in favor of our machine overlords when IBM’s Deep Blue beat reigning chess champ Garry Kasparov). I’ve always been confused by this for, as complex as chess is, there exists a game so much more complex that it makes chess look like a game of tic-tac-toe.

That game is Go. Invented in China and reaching the West through Japan, Go is a game with relatively simple rules, but requires a depth of understanding to master (personal note: one of the brains behind Bench Press, Eric, is an avid fan of the game) given the intricacies of gameplay and the structure of the board. This makes it exceedingly difficult for a computer using brute-force methods to defeat even a moderately skilled Go player. For instance:

  • In chess, different pieces have specific limitations on where they can move. There are very few limitations on where Go pieces can be placed on a board.
  • In chess, pieces can be removed via capture, and those pieces can never return. No such limitation exists in Go. A piece can be removed, but it can just as easily come back in a later turn.
  • The Go board is a 19 x 19 grid compared to a chess board which merely has 8 x 8 squares. This translates into ~100-200 possible moves each turn of Go compared with ~30-40 in chess.

What does this mean? A quick Wikipedia search shows an estimated 10^170 possible end-states and 10^360 to 10^700 possible games for Go (compared to a measly 10^50 end-states and 10^120 games for chess). To give a sense of how large these numbers are, there are an estimated 10^80 atoms in the observable universe!
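To get a feel for where numbers of this size come from, a crude back-of-the-envelope estimate is (moves per turn) raised to the power of (turns per game). The sketch below is only that: the moves-per-turn values are mid-range picks from the estimates quoted above, and the game lengths (80 turns for chess, 200 for Go) are my own illustrative assumptions, not figures from the Wikipedia article.

```python
import math


def log10_game_count(moves_per_turn: float, turns_per_game: int) -> float:
    """Return log10 of moves_per_turn ** turns_per_game.

    Working with the logarithm avoids building an astronomically large number.
    """
    return turns_per_game * math.log10(moves_per_turn)


# Moves per turn: mid-range values of the estimates quoted in the post.
# Turns per game: assumptions chosen purely for illustration.
chess_exponent = log10_game_count(moves_per_turn=35, turns_per_game=80)
go_exponent = log10_game_count(moves_per_turn=150, turns_per_game=200)

print(f"chess: roughly 10^{chess_exponent:.0f} possible games")
print(f"go:    roughly 10^{go_exponent:.0f} possible games")
```

Even with these rough inputs, the chess figure lands near the quoted estimate and the Go figure falls inside the quoted range, which shows how quickly a larger board and a longer game compound.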

The sheer complexity of the game and the ability of human masters to intuitively understand and visualize the board (as Dartmouth artificial intelligence professor Bob Hearn puts it in an interview with Wired Science, “Go is a game of living things, and you talk about it that way, as if the patterns might be alive”) led many to believe that developing a program capable of beating humans at Go would thus be a high sign of artificial intelligence. After all, what computer can possibly brute force search far enough ahead in a game of Go to beat a human?

Well, as it turns out, creating a computer algorithm that can understand a game of Go is exceedingly difficult. But, while ingenuity and intuition are difficult for computers, simulating them by number-crunching carefully conducted statistical simulations is well within a computer’s bag of tricks. New programs based on compiling the results of millions of Monte Carlo simulations (a computational technique revolving around crunching the results of many random tests) have succeeded where dozens of previous attempts at introducing human-pattern-recognition heuristics failed. Instead of attempting to analyze every possible move or feebly trying to understand the layout of a game, these Monte Carlo Go programs crunch through the results of their random games to determine quick statistical rules of play which help guide still further Monte Carlo simulations. The result is a computer which gets more and more knowledgeable about how Go games may turn out.
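For the curious, here is a minimal sketch of the basic Monte Carlo idea. It is not MoGo’s algorithm: it is plain “flat” Monte Carlo with no feedback from earlier simulations into later ones, and it plays a toy game I picked for brevity (a pile of stones, each player takes 1 to 3, whoever takes the last stone wins) rather than Go. The function names are made up for this example.

```python
import random


def random_playout(stones: int, my_turn: bool, rng: random.Random) -> bool:
    """Finish the game with random moves; return True if 'we' take the last stone."""
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return my_turn  # whoever just moved took the last stone and wins
        my_turn = not my_turn
    # The pile was already empty, so the side that moved just before won.
    return not my_turn


def best_move(stones: int, playouts: int = 2000, seed: int = 0) -> int:
    """Pick the move whose random playouts were won most often."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    win_rates = {}
    for take in range(1, min(3, stones) + 1):
        wins = sum(
            random_playout(stones - take, my_turn=False, rng=rng)
            for _ in range(playouts)
        )
        win_rates[take] = wins / playouts
    return max(win_rates, key=win_rates.get)


print(best_move(5))  # the move the random playouts favor from a pile of 5
```

The program never “understands” the game. It just notices that, over thousands of random games, some first moves win more often than others.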

The result? On August 7, 2008, for the first time in history, the Monte Carlo Go program MoGo beat 8-dan (the second highest ranking possible) professional Go player Kim Myungwan. In all fairness to Kim, the program had a 9-stone handicap (a lead you give a beginner). But I think the key takeaway here is the power of statistical algorithms to mimic (and potentially surpass) human ingenuity.

And, with new methods like crowdsourcing, distributed computing, and alternative chip architectures making supercomputer power much more accessible, that’s something which scientists, doctors, and engineers may all hopefully benefit from in the near future.

(Image Credit) (Image Credit)

Written by ben

March 30th, 2009 at 9:45 am

Distribute compute

As the problems scientists solve become more and more complex, so do their demands for computational power. One approach to addressing this has been to build faster, more powerful computers, potentially with chips better suited to performing advanced calculations (like graphics cards or IBM’s Cell processor). But, this approach has serious limitations — mainly that it’s expensive to build and to maintain these supercomputers.

Some researchers, however, have turned to a radically different approach. Instead of building a bigger, better mousetrap to deal with more mice, distributed computing sets out many small, cheap mousetraps. The result is cheap “supercomputers” which “pool” the computing power of many computers connected over a network.

This approach has been used by projects like Folding@Home and SETI@Home which are able to combine computing power from volunteers over the internet to do the number-crunching needed to simulate protein folding or scan deep space for extraterrestrial life. SETI@Home was the first such large-scale distributed computing platform. This platform, now the Berkeley Open Infrastructure for Network Computing (BOINC), is today used for many other distributed computing projects such as attempts to search for gravitational waves, do climate modeling, and simulate particle collisions in the Large Hadron Collider.
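The pooling idea can be sketched in a few lines. This is only an illustration of the pattern (split a big job into independent work units, hand them out, combine the results), not how BOINC or Folding@Home work internally: the “volunteers” here are just threads on one machine, the “science” is a sum of squares, and every name in it is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_work_units(start: int, stop: int, unit_size: int) -> list[range]:
    """Split one big job into small, independent work units."""
    return [range(lo, min(lo + unit_size, stop)) for lo in range(start, stop, unit_size)]


def crunch(unit: range) -> int:
    """Stand-in for real science: sum of squares over one work unit."""
    return sum(n * n for n in unit)


def run_project(start: int, stop: int, unit_size: int, volunteers: int) -> int:
    """Hand work units to a pool of 'volunteers' and combine what comes back."""
    units = make_work_units(start, stop, unit_size)
    with ThreadPoolExecutor(max_workers=volunteers) as pool:
        partial_results = list(pool.map(crunch, units))
    return sum(partial_results)


total = run_project(0, 100_000, unit_size=5_000, volunteers=8)
print(total)
```

The key property is that no work unit depends on another, so it does not matter which volunteer finishes first or how slow any single machine is.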

Folding@Home, a project started by the Pande group at Stanford to use distributed computing to study protein folding, uses a similar approach, albeit with different underlying software (is it any wonder that a Stanford group doesn’t use Berkeley’s distributed computing platform?! :-D). It has probably been the most successful distributed computing approach to date and, as a testament to the power of distributed computing, has become known as the first computing system to break the petaFLOPS barrier, i.e. to be capable of one quadrillion floating point calculations per second! This has enabled the team to do protein-folding simulations on a scale of ~10 microseconds.

But, as impressive as the science achieved by distributed computing projects is, what impresses me the most is that projects like Folding@Home and SETI@Home have defined some brilliant new ways to do science:

  • Use the internet – It’s a common theme on Bench Press, but with more and more people having faster and faster access to the internet, the potential for distributed computing becomes greater and greater. As Folding@Home demonstrated, such approaches can produce computing systems as powerful as (or potentially more powerful than) leading supercomputer systems at a fraction of the cost.
  • Mobilize the public – We’ve discussed ways for the scientific community to reach out to the public like using social media and creating interactive applications/tools for the public to use, but efforts like Folding@Home illustrate a way to not only reach out to the public but to get them invested in science. In a world where high school science teachers find it difficult to get teens interested in science, initiatives like Folding@Home have created a system where teams of individuals compete over who can contribute the most to the effort! Instead of simply hoping that the public will continue to fund and listen, why not borrow a page from the many existing cancer-walk-a-thons and make it easy for the public to get involved?
  • Leverage new technology – It may not come as a surprise to our readers that a significant amount of the computational power at Folding@Home comes from graphics cards and Playstation 3s. But, while many “mainstream” supercomputers ignored the power afforded by these new chip types, Folding@Home developed software so that volunteers could quickly and easily use these powerful chips to boost their Folding@Home scores. The Folding@Home initiative also developed software to take advantage of innovations AMD and Intel included in their chips (new multi-core architectures and special instructions to speed up calculations). Is it any wonder, then, that Sony, NVIDIA, and AMD have all publicly announced support for the initiative with their products?

I don’t pretend that every scientific problem is amenable to a distributed computing initiative, but to some extent, I believe that every scientific endeavor has something valuable to learn from the success of Folding@Home and SETI@Home and their brethren. To that end, I sincerely hope to see an open-source distributed computing architecture like BOINC but with:

  • Support for new chip technologies – To provide greater value to the scientific effort, the architecture should support new chip technologies like Intel’s SSE extensions, SMP, or stream processing.
  • Client contribution tracking – To make it easier for volunteers to know how much they’ve contributed and/or hold contests over how much they’ve contributed, a simple system that lets users/administrators track the effort is needed.
  • Better security – Medical initiatives and volunteer privacy concerns demand very fine and specialized security controls. Support for sophisticated encryption and authentication is a must.
  • Linkage to social media – This probably seems extraneous, but since distributed computing efforts depend on motivated volunteers actively seeking out new volunteers, a successful architecture needs to make it easy for volunteers to share their progress with their friends whether it be via blog, or social network, or Twitter, or anything.
  • Tie-in with new cloud computing systems – Along the theme of cutting costs, it is reasonable to assume that as offerings like Google’s App Engine and Amazon’s EC2 and technologies like MapReduce become better developed, we will see cash-strapped research groups using the power of “Clouds” to hold their computing power – after all, what is distributed/grid computing other than a specific variant of cloud computing (de-localized, pooled computing)? It’s probably necessary, then, for the new distributed computing architecture to more easily link with EC2 or MapReduce or App Engine.
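As an illustration of how little is needed for the contribution-tracking item on this wish list, here is a minimal sketch. The class, its methods, and the volunteer and team names are all hypothetical, and a real system would also need persistence and some way to verify that submitted work is genuine.

```python
from collections import Counter


class ContributionTracker:
    """Tally completed work units per volunteer and per team."""

    def __init__(self) -> None:
        self.by_volunteer: Counter = Counter()
        self.by_team: Counter = Counter()

    def record(self, volunteer: str, team: str, work_units: int) -> None:
        """Credit a volunteer (and their team) with finished work units."""
        self.by_volunteer[volunteer] += work_units
        self.by_team[team] += work_units

    def leaderboard(self, top: int = 3) -> list[tuple[str, int]]:
        """Return the top teams, highest total first."""
        return self.by_team.most_common(top)


tracker = ContributionTracker()
tracker.record("alice", "Team Proteome", 12)   # made-up sample data
tracker.record("bob", "Team Proteome", 5)
tracker.record("carol", "Team Transcriptome", 9)
print(tracker.leaderboard())
```

The same tallies could feed both a personal “how much have I contributed” page and the team contests described above.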

Anyone else have any thoughts?

(Image Credit – picture of the internet) (Image Credit – Folding@Home computing power)

They’re not just for gaming

There was a time when video game consoles and graphics cards were “just for games.” In those days, game console chips and graphics cards were the domain of little boys, not grown men. Well, thank the stars those days are long gone!

Today, if someone were to tease a grown man for purchasing Sony’s Playstation 3, he could simply reply, “I beg your pardon. I am a grown man, not a little boy. I am clearly using the Playstation 3, not to play great games like Grand Theft Auto IV and Metal Gear Solid, but to use its TeraFLOPS (1 trillion floating point calculations per second) capacity to solve important and complex scientific problems.”

It almost sounds like a fantasy, but it’s not. The idea behind this is pretty basic. To make games and graphics run smoothly, video game console chips and graphics cards have to do a mind-boggling number of calculations much faster than a basic computer chip can. It “just so happens” that the supercomputers scientists and Wall Street analysts use to do simulations and research with also need to do those same types of calculations. Hence the idea of Stream Processing was born – why not use graphics card/game console chips for things which aren’t directly related to graphics or gaming?
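The programming model behind this can be sketched as follows: write one small calculation (a “kernel”) and apply it independently to every element of a stream of data. The sketch below only illustrates the model. It runs one element at a time on an ordinary CPU, whereas a graphics chip would run many copies of the kernel at once. The function names and sample numbers are made up for this example.

```python
from typing import Callable, Iterable, Iterator


def scale_and_add(a: float, x: float, y: float) -> float:
    """Kernel: one small, independent calculation on one element of the stream."""
    return a * x + y


def run_kernel(
    kernel: Callable[[float, float, float], float],
    a: float,
    xs: Iterable[float],
    ys: Iterable[float],
) -> Iterator[float]:
    """Apply the same kernel to every pair of elements.

    No element depends on any other, which is what would let a graphics
    chip process many of them simultaneously.
    """
    for x, y in zip(xs, ys):
        yield kernel(a, x, y)


results = list(run_kernel(scale_and_add, 2, [1, 2, 3], [10, 20, 30]))
print(results)  # [12, 24, 36]
```

Rendering pixels and crunching simulation data both fit this shape, which is why chips built for one turn out to be good at the other.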

Why not indeed? I can’t list all of the projects out there, but here’s just a snapshot of the scientific applications that people have been able to do with the Playstation 3’s unique chip, IBM’s Cell Broadband Engine, and graphics cards from NVIDIA and AMD:

Technology – it’s good for more than just playing games.

Image Credit

Written by ben

August 30th, 2008 at 7:17 am