Parasitic Computation and JavaScript

by Jonas Huckestein on September 2nd, 2010

Parasitic computing is a technique [...] that uses the legitimate function of computing hosts to perform some other computation.

From Gian David Perrone‘s Master’s Thesis [PDF]

According to a Nielsen report, broadband-connected Americans spend 2.6B hours online every month. That's roughly 300,000 computing-years per month in the US alone while people are actually in front of the screen. The global internet population is 2B people! That is a lot of largely unused processing power.

There are projects that use idle processors to solve difficult problems (SETI@Home, Rosetta@Home, SuperDonate, etc). Most of those projects are based on BOINC and require installing client software. All BOINC projects have a combined all-time user base of 2M users. That is 1/1000 of the internet population (probably less, given that BOINC runs on servers as well). For a mind-blowing list of volunteer computing articles click here.

I think we can do better.

Parasitic JavaScript and MapRejuice

This weekend we (@shiftb and @ryanmcgrath and me) built MapRejuice for the Node Knockout competition. I finally found some time to reflect on what we learned and where we’re at, so here it is.

MapRejuice is a distributed Map Reduce implementation in JavaScript. A website owner can insert a small piece of code into his website and his visitors will start cranking away at MapRejuice jobs. This computing power can be sold and the webmaster gets a cut. Goodbye advertisements? Not quite yet. There’s a few things to figure out first.
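To make the idea concrete, here is a minimal word-count job written as a map/reduce pair in plain JavaScript. The function names and job shape are illustrative only, not MapRejuice's actual API:

```javascript
// Hypothetical sketch of a word-count job a worker might run.
// map() processes one document; reduce() merges partial results
// coming back from many workers.
function map(doc) {
  const counts = {};
  for (const word of doc.toLowerCase().split(/\W+/).filter(Boolean)) {
    counts[word] = (counts[word] || 0) + 1;
  }
  return counts;
}

function reduce(partials) {
  const total = {};
  for (const counts of partials) {
    for (const [word, n] of Object.entries(counts)) {
      total[word] = (total[word] || 0) + n;
    }
  }
  return total;
}
```

Each visitor's browser would run map() on a small chunk and post the partial counts back; the server (or another worker) runs reduce() over whatever partials arrive.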

Distributed computing in JavaScript has been attempted numerous times before.

The problems commonly associated with Parasitic Javascript are

  • Security: The worker scripts could modify the host site. Also the data and algorithms used in the computation are visible to the clients.
  • Performance: A few years ago JavaScript implementations were still rather slow.
  • Latency: Most distributed computing applications are not practical here because the input data is so large. The latency from sending this data to workers would kill any large-data computation.
  • Reliability: Workers in this kind of setup are not trustworthy and fail often (e.g. when the user navigates away from the host site).
  • Restrictiveness: The in-browser JavaScript runtime is not sufficient for complex calculations. There is no binary protocol and data cannot be encrypted. Data cannot be streamed. Data cannot be requested and sent back across different domains.

Using the upcoming Web Workers standard makes it impossible for workers to modify the host site, and the performance of modern JavaScript engines with JIT compilers is not too bad. Some of the restrictiveness may be solved by technological advances, but let's not count on that.
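A rough sketch of that isolation: the job logic lives in a separate worker script with no access to the page's DOM, and postMessage is the only channel back to the host site. The file name and payload here are made up for illustration:

```javascript
// worker.js (hypothetical) — runs in an isolated context: no document,
// no window, so the untrusted job cannot touch the host site.
function crunch(numbers) {
  // Example payload: sum of squares over a chunk of numbers.
  return numbers.reduce((acc, x) => acc + x * x, 0);
}

// In the browser, the worker would expose crunch() via messages:
//   onmessage = (e) => postMessage(crunch(e.data));

// And the host page would wire it up like:
//   const worker = new Worker("worker.js");
//   worker.onmessage = (e) => sendResultToServer(e.data);  // hypothetical
//   worker.postMessage([1, 2, 3]);
```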

What now?

So what have we got? We can run jobs on large numbers of unreliable, non-trustworthy workers with slow transfer speeds in a restricted, sandboxed environment. Now we’re looking for a class of problems that are well suited for that scenario.

Most Map Reduce jobs seem to run some sort of analysis on terabytes of data with chunk sizes of 64MB. That’s not going to work in our case. But what if the data sizes were generally small, or the data was publicly available, or the data could be randomly generated? How about a distributed count of words in Wikipedia? Or a Monte Carlo simulation for investment portfolios (thanks Tobias!)? Or an analysis of a person’s interests and habits on the internet?
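The Monte Carlo case is the friendliest one, because each map task can generate its own random input and only ship a few numbers back. As an illustrative sketch (not a real portfolio simulation), here is a map/reduce-shaped pi estimator:

```javascript
// Monte Carlo pi estimation shaped as a map/reduce job.
// Workers need no input data: each map task draws its own random
// points and returns only two counters.
function mapTask(samples) {
  let inside = 0;
  for (let i = 0; i < samples; i++) {
    const x = Math.random();
    const y = Math.random();
    if (x * x + y * y <= 1) inside++; // point fell inside the unit circle
  }
  return { inside, samples };
}

function reduceTask(partials) {
  const inside = partials.reduce((sum, p) => sum + p.inside, 0);
  const samples = partials.reduce((sum, p) => sum + p.samples, 0);
  return (4 * inside) / samples; // pi estimate
}
```

The transfer cost per worker is constant regardless of how much computation it did, which is exactly the profile a high-latency, low-bandwidth worker pool needs.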

As I said, there are a few things to figure out, and the technology may not quite be there yet, or widespread enough. But that doesn't mean it's not worth a shot. And before we know it the next wave of technologies such as Google's Native Client will be available and we can run BOINC in everybody's browsers :)

We’d appreciate any input on the topic since we’re new to distributed computing.

Update 1: Voting at Node Knockout has ended and we came in second in the innovation category and seventh overall. Not too bad :) Thanks for all the support and congratulations to the winners!

Update 2: We got covered in the MIT Technology Review (errr, the blog at least). We also got a shout-out on the Changelog (starts at 8:30).

Comments
  1. I made jsdc a few years ago, and more recently I made http://distributed-pi.appspot.com/static/beta.html (Someone else made http://www.csc.liv.ac.uk/~acollins/pi which does pretty much the same thing with more pretty charts and a non-broken implementation).

    • jonas.huckestein permalink

      Updated with antimatter's links. Thanks!

  2. How about protein folding? The Folding@Home project is a good place to start looking at this problem. It is in a class of problems that don’t require much data but solving them requires a massive search over a combinatorial space. If that search can be made parallel a MapRejuice approach might be useful.

    Along those lines you may want to look into SAT problems. Perhaps the SAT competitions site is a good place to start: http://www.satcompetition.org/

    • Hey Mark, those are great ideas! The SAT competition would have been a fantastic proof of concept. Ah well :) Next time.

      How long would a typical Folding@Home computation run? If a computation runs longer than a visitor remains on a website, then the time will have been wasted.

  3. Aaron permalink

    Fantastic!

    You get this going and I will click on your processing URL before I go to bed and let it process all night. I won't load special software, but I want to help.

    BTW, a beautiful looking graphic while it’s working always helps.

  4. Tim Kelley permalink

    Hey Jonas -

    Sent you a message on Facebook. Trying to get ahold of you about some code writing, but I’m not sure how to find you! Katharina Juranek recommended me to you. If you could, check that message and get back to me?

    Tim Kelley
