Sunday, June 10, 2007

Stream Computing with AJAX (Part 3)

I mentioned in Part 2 of Stream Processing with Javascript that I would talk about its morality.

However, there is something more pressing. I've been trying to work out exactly how to stop users reverse-engineering the Javascript Stream Processing system and injecting spurious results. It seems that this is a common issue with Calculate@Home projects. From the Folding@Home FAQ:

Unlike many computer projects, the paramount concern is not functionality, but the scientific integrity, and posting the source code in a way that would allow people to reverse engineer the code to produce bogus scientific results would make the whole project pointless.

The main problem with AJAX stream computing is that users haven't deliberately downloaded the software to help the cause; they are having the calculations forced upon them. Because of this, there would likely be more hostility towards any such project, and hackers would be more likely to try to inject malicious results.

As the code is executed on the client, and Javascript is delivered as source and interpreted, it is very easy to reverse-engineer any such code. And because anything the script can read the user can read too, it's very difficult to do any sort of encryption or confirmation of results on the client.

Of course not everyone out there is going to try to hack the system, so one solution might be to calculate each batch of data on several different machines and then, given a discrepancy, recalculate that batch.

Still to discuss: optimisation (in particular, bandwidth optimisation through local data storage) and morality.

Edit: Since writing this I've realised that you just need to take the majority vote from a number of results for the same batch.
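A minimal sketch of the majority-vote idea, assuming the server has already collected the results that several clients returned for the same batch. The function name `majorityVote` and the choice to compare results by their JSON form are illustrative assumptions, not part of any existing system.

```javascript
// Hypothetical helper: return the result that a strict majority of
// clients agree on, or null if no such majority exists.
// `results` is an array of values reported for the same batch.
function majorityVote(results) {
  const counts = new Map();
  for (const r of results) {
    const key = JSON.stringify(r); // compare results by serialised form
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  let bestKey = null;
  let bestCount = 0;
  for (const [key, count] of counts) {
    if (count > bestCount) {
      bestKey = key;
      bestCount = count;
    }
  }

  // Accept only when more than half of the clients agree; otherwise
  // return null so the caller can send the batch out again.
  if (bestCount * 2 > results.length) {
    return JSON.parse(bestKey);
  }
  return null;
}
```

For example, `majorityVote([42, 42, 7])` accepts 42, while `majorityVote([1, 2])` returns null and the batch would need to be recalculated.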

2 comments:

Anonymous said...

If you are looking for a way to prevent reverse engineering of Javascript, forget about it. At best you can make it a little more difficult to read. Java or Flash may be a better choice, but even then a descrambler and a packet sniffer will reveal everything needed to hack it.

To prevent data pollution, you only have to let the server verify the solutions found. Finding a solution is the time-consuming part, not verifying it.
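As a rough illustration of cheap verification, take integer factorisation as an assumed example problem (it is not one mentioned above): finding the factors is slow, but checking them is a single multiplication. The function name `verifyFactors` is hypothetical.

```javascript
// Hypothetical server-side check: a client claims that p * q === n.
// Assumes all three values fit in JavaScript's safe integer range.
function verifyFactors(n, p, q) {
  // Reject malformed answers.
  if (!Number.isSafeInteger(n) ||
      !Number.isSafeInteger(p) ||
      !Number.isSafeInteger(q)) {
    return false;
  }
  // Reject trivial factorisations such as 1 * n.
  if (p <= 1 || q <= 1) {
    return false;
  }
  // The actual verification is one multiplication.
  return p * q === n;
}
```

This only works for problems where a claimed answer can be checked much faster than it can be found.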

Or send the same packets to different users, and compare the results.

Getting around XMLHttp security is simple: don't use it; use iframes instead.

A problem you didn't mention: Javascript isn't multithreaded. So as long as the script is running, the browser window is locked and using 100% processor time.

It's possible to write some sort of threading emulator with the help of timers, but that doesn't really work for time-consuming calculations.
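A minimal sketch of the timer approach, assuming the calculation can be split into independent items and that `chunkSize` is at least 1. The name `processInChunks` and its parameters are hypothetical. It does not make a long single step any faster; it only gives the browser a chance to respond between slices.

```javascript
// Hypothetical helper: process `items` in slices of `chunkSize`,
// handing control back to the browser between slices.
// `schedule` is optional and only exists so the scheduler can be swapped.
function processInChunks(items, work, chunkSize, onDone, schedule) {
  // Default scheduler: a zero-delay timer, which lets pending UI events run.
  schedule = schedule || function (fn) { setTimeout(fn, 0); };
  const results = [];
  let index = 0;

  function runChunk() {
    const end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      results.push(work(items[index]));
    }
    if (index < items.length) {
      schedule(runChunk); // more to do: queue the next slice
    } else {
      onDone(results);    // finished: hand back all results
    }
  }

  runChunk();
}
```

A call such as `processInChunks(data, step, 100, sendToServer)` would run `step` on 100 items at a time; `data`, `step` and `sendToServer` are placeholders here.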

Anonymous said...

I like the idea of verifying the solutions with the server; however, I'm concerned that many problems would take too long to verify on a server.

For example, one step in a complex simulation could only be verified by re-running that step.

I like the redundancy idea of sending packets to different users.

The Javascript locking out the browser could be an issue; I guess I'll have to keep the packets to be processed small.