Statistical methods

Message boards : Science : Statistical methods

Ananas
Joined: 8 Jun 13
Posts: 128
Credit: 1,947,833
RAC: 0
Message 2591 - Posted: 19 Sep 2015, 21:36:24 UTC
Last modified: 19 Sep 2015, 21:38:26 UTC

I have trouble understanding how or why projects like RNA-World, QuantumFIRE, QMC and this one work.

This is how I understood it :

The method all these projects use is a statistical one. Statistical methods work somewhat like swarm intelligence: a sample consisting of a single result is worth nothing, since it might be anything between totally wrong and exactly right. The larger the sample becomes, the more exact the average(!) value of all results combined will be.
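A rough sketch of that claim, assuming each result is a true value plus Gaussian noise. The true value, the noise level and the function names are made up for illustration and are not taken from any of these projects:

```python
import random
import statistics

def mean_of_noisy_results(n_results, true_value=-6.45, noise_sd=0.5, seed=0):
    """Average n_results simulated results, each = true_value + Gaussian noise."""
    rng = random.Random(seed)
    results = [rng.gauss(true_value, noise_sd) for _ in range(n_results)]
    return statistics.fmean(results)

def spread_of_mean(n_results, trials=200):
    """Standard deviation of the averaged value over independent repetitions."""
    means = [mean_of_noisy_results(n_results, seed=t) for t in range(trials)]
    return statistics.stdev(means)

if __name__ == "__main__":
    # The spread of the average shrinks as the sample grows (roughly as 1/sqrt(n)).
    for n in (1, 10, 100, 1000):
        print(n, round(spread_of_mean(n), 4))
```

Under this assumed noise model, the average of 100 results scatters about ten times less than a single result does.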

But :

All the mentioned BOINC projects calculate one single value per problem using a single seed value (most use a random seed; RNA-World uses a fixed seed for validation), which, if my assumption is correct, would be worth nothing. Wouldn't it be necessary to calculate each ligand/receptor combination many times with different random seeds, then combine the results and see in which numeric range the results become most dense, something like the peak of a (linear or log-) normal distribution?
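The multi-seed procedure proposed here could look roughly like the sketch below. `simulated_dock_score` is a hypothetical stand-in for one docking run (a made-up true score plus seed-dependent Gaussian noise), not the project's actual application, and the bin width is an arbitrary choice:

```python
import random
import statistics
from collections import Counter

def simulated_dock_score(seed, true_score=-6.45, noise_sd=0.3):
    """Hypothetical stand-in for one run: true score plus seed-dependent noise."""
    rng = random.Random(seed)
    return rng.gauss(true_score, noise_sd)

def densest_bin(scores, bin_width=0.1):
    """Return the centre of the histogram bin that holds the most scores."""
    counts = Counter(round(s / bin_width) for s in scores)
    best_bin, _ = counts.most_common(1)[0]
    return best_bin * bin_width

def summarize(seeds):
    """Run once per seed and report where the scores cluster."""
    scores = [simulated_dock_score(s) for s in seeds]
    return {
        "mode_estimate": densest_bin(scores),
        "median": statistics.median(scores),
        "stdev": statistics.stdev(scores),
    }

if __name__ == "__main__":
    print(summarize(range(500)))
```

Whether this extra cost is worthwhile depends on how large `stdev` is compared with the score differences between candidate pairs.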

Ben
Project administrator
Project developer
Project tester
Project scientist
Joined: 17 Nov 14
Posts: 316
Credit: 1
RAC: 0
Message 2592 - Posted: 22 Sep 2015, 9:29:07 UTC - in response to Message 2591.

The score does not change much when you use another seed.
For example, it varies from -6.44XXXX to -6.45XXXX.

We are trying to find a ligand/receptor pair with a very negative score. The enormous number of tasks comes from the size of the genome we are screening.
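To illustrate why a small seed-to-seed variation makes one run per pair sufficient for ranking, here is a minimal sketch. The pair names and true scores are invented, and the seed effect is assumed to be a uniform offset of at most 0.005, loosely matching the -6.44 to -6.45 range quoted above:

```python
import random

# Hypothetical true scores for a few ligand/receptor pairs (made-up values).
TRUE_SCORES = {"pair_A": -6.445, "pair_B": -8.10, "pair_C": -5.20, "pair_D": -7.30}

def single_run_score(pair, seed, seed_spread=0.005):
    """One simulated run: true score plus a small seed-dependent offset."""
    rng = random.Random(f"{pair}:{seed}")
    return TRUE_SCORES[pair] + rng.uniform(-seed_spread, seed_spread)

def ranking(seed):
    """Pairs ordered from most negative (best) to least negative score."""
    return sorted(TRUE_SCORES, key=lambda p: single_run_score(p, seed))

if __name__ == "__main__":
    # Every seed gives the same order because the gaps between pairs
    # are far larger than the assumed seed effect.
    for seed in range(3):
        print(seed, ranking(seed))
```

Only pairs whose true scores lie within the seed spread of each other could swap places, and such near-ties matter little when the goal is to pick out strongly negative scores.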

The pair will then be tested in the lab by measuring the thermostability.

Does it make sense?

Ananas
Joined: 8 Jun 13
Posts: 128
Credit: 1,947,833
RAC: 0
Message 2594 - Posted: 22 Sep 2015, 22:47:47 UTC - in response to Message 2592.

Basically yes, thanks :-)

I assumed that the range of results would be much wider, which would have required several samples for the same pair so that no promising candidates could slip through.

But then ... if the influence is that minor, why use a random seed at all instead of the same seed value for all calculations? You wouldn't need to store the seed in the database if it were a constant.
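The storage difference can be sketched as follows. The task record layout, `FIXED_SEED` and `simulated_run` are all hypothetical and only illustrate the bookkeeping, not how the project's database or application actually work:

```python
import random

FIXED_SEED = 12345  # arbitrary constant, chosen only for this illustration

def simulated_run(pair_id, seed):
    """Hypothetical stand-in for one run; deterministic for a given pair and seed."""
    rng = random.Random(f"{pair_id}:{seed}")
    return round(-6.0 - rng.random(), 4)

def make_task_random_seed(pair_id):
    """Random-seed variant: the seed must be stored to reproduce the result."""
    seed = random.SystemRandom().randrange(2**31)
    return {"pair": pair_id, "seed": seed}

def make_task_fixed_seed(pair_id):
    """Fixed-seed variant: the pair identifier alone is enough."""
    return {"pair": pair_id}

def rerun(task):
    """Reproduce a task's result from its stored record."""
    return simulated_run(task["pair"], task.get("seed", FIXED_SEED))

if __name__ == "__main__":
    print(rerun(make_task_fixed_seed("ligand42/receptor7")))
    print(rerun(make_task_random_seed("ligand42/receptor7")))
```

Both variants are reproducible; the fixed-seed one simply drops a column from the record.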

Ben
Project administrator
Project developer
Project tester
Project scientist
Joined: 17 Nov 14
Posts: 316
Credit: 1
RAC: 0
Message 2595 - Posted: 23 Sep 2015, 8:23:37 UTC - in response to Message 2594.

You're right, I could have chosen a single seed, but I just didn't think about it...


Copyright © 2017 Dr Anthony Chubb