Forums | MacLife
#77 2007-09-23 10:18 am
- otsep
- Member

- From: Virginia
- Registered: 2007-09-06
- Posts: 88
Re: Distributed(Xgrid) F@H
ArtemisG3 wrote:
otsep wrote:
I do have F@H on my Intel iMac, maybe it's helping the numbers. I have the Xserves running from 1700 to 0700 every day.
By the way, are you able to allow your Xserves to run full-time on weekends? That is what I am doing with my workstations. My points get a big boost on Sunday and Monday. My crontab looks like this:
0 18 * * 1,2,3,4,5 cd ~/Library/.FAH/; nohup ./fah5 -local -advmethods -forceasm &
0 7 * * 1,2,3,4,5 killall fah5
This common-sense change has really helped my numbers. Thanks again!
Offline
#78 2007-09-23 12:27 pm
Re: Distributed(Xgrid) F@H
otsep wrote:
This common-sense change has really helped my numbers. Thanks again!
You're welcome. It's good to see your already good numbers improving.
Offline
#79 2007-09-25 11:19 am
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
It appears that Team production is settling in at 60,000 per day, but watch out on weekends with Otsep's production around 35-40,000 "normally".
Otsep's average protein is worth 210 points with 180 proteins completed per day. (some adjustments made, as well as rounding.)
I'm wondering if some proteins are not making the deadlines.
Also, some proteins have been known to crash, and once in a while, Folding must be re-installed.
So, a logfile inspection once in a while is worthwhile, but on 417 CPUs ??? There's got to be a way to get the last line of each Folding logfile to see what's going on and put it all into 1 file. Example line: Xserve Intel CPU # 177 : 28 % complete.
All in all, ggggggrrrrrrrrrrrrrrreat.
Offline
#80 2007-09-26 1:54 pm
- otsep
- Member

- From: Virginia
- Registered: 2007-09-06
- Posts: 88
Re: Distributed(Xgrid) F@H
Nefarious wrote:
Otsep's average protein is worth 210 points with 180 proteins completed per day. (some adjustments made, as well as rounding.)
I'm wondering if some proteins are not making the deadlines.
Also, some proteins have been known to crash, and once in a while, Folding must be re-installed.
So, a logfile inspection once in a while is worthwhile, but on 417 CPUs ??? There's got to be a way to get the last line of each Folding logfile to see what's going on and put it all into 1 file. Example line: Xserve Intel CPU # 177 : 28 % complete.
Anything I could do to help output, I'll try.
tail -1 FileNameHere
The command above will give you the last line of the log. Then I'll capture all the outputs from each server and "cat" them together. Is there anything in particular I should be looking for?
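A minimal sketch of that collection step, assuming each client directory holds a log named FAHlog.txt. The function name summarize_logs and the hosts.txt file in the commented usage are made up for illustration, not part of any F@H tooling.

```shell
# Print "<directory>: <last log line>" for every client directory given.
# Assumes the log is called FAHlog.txt inside each directory.
summarize_logs() {
  for dir in "$@"; do
    log="$dir/FAHlog.txt"
    if [ -r "$log" ]; then
      printf '%s: %s\n' "$dir" "$(tail -n 1 "$log")"
    else
      printf '%s: no readable FAHlog.txt\n' "$dir"
    fi
  done
}

# Hypothetical usage across many servers (hosts.txt = one hostname per line):
#   while read -r host; do
#     printf '%s: ' "$host"
#     ssh -n "$host" 'tail -n 1 ~/Library/.FAH/FAHlog.txt'
#   done < hosts.txt > all_status.txt
```

Running it locally first on one or two client directories is an easy way to check that the last line really is the progress line you want before collecting from every server.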
Offline
#81 2007-09-26 1:56 pm
- otsep
- Member

- From: Virginia
- Registered: 2007-09-06
- Posts: 88
Re: Distributed(Xgrid) F@H
Nefarious wrote:
Otsep's average protein is worth 210 points with 180 proteins completed per day. (some adjustments made, as well as rounding.)
Don't forget the first week or so I had problems getting everything installed on my servers properly.
Offline
#82 2007-09-26 1:57 pm
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
I just checked --- your daily average for the past 7 days is over 36,000 today.
Offline
#83 2007-09-26 2:34 pm
Re: Distributed(Xgrid) F@H
otsep wrote:
Anything I could do to help output, I'll try.
tail -1 FileNameHere
The command above will give you the last line of the log. Then I'll capture all the outputs from each server and "cat" them together. Is there anything in particular I should be looking for?
I'm not sure if that is the best way to monitor your clients. If f@h crashes, there may or may not be any indication in the FAHlog.txt. One idea is to simply get the modification date of that file. If it is older than a day, you will know that something is not right.
Every now and then, f@h will not be able to connect to Stanford's server, and it will write those errors to the log. If it can't connect after a certain number of attempts, f@h will shut down. Assuming that the connection error is with Stanford and not local to your machine, then when f@h launches again on its next cron cycle it will try to connect again, hopefully successfully.
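The modification-date idea could be sketched roughly like this. The function name log_is_stale is hypothetical, and the 1440-minute (24-hour) threshold is just the "older than a day" rule from above; a missing log is treated as stale too, which is my own assumption.

```shell
# Succeed (exit 0) if the given log is missing or was last modified
# more than 1440 minutes (24 hours) ago; fail (exit 1) otherwise.
log_is_stale() {
  log=$1
  # A missing log is treated as a problem as well.
  [ -e "$log" ] || return 0
  # find prints the path only when the age test matches.
  [ -n "$(find "$log" -mmin +1440 2>/dev/null)" ]
}

# Hypothetical usage:
#   if log_is_stale ~/Library/.FAH/FAHlog.txt; then
#     echo "$(hostname): FAHlog.txt is stale or missing"
#   fi
```

I used -mmin rather than -mtime because the minute-based test leaves less room for surprises in how whole days get rounded.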
Offline
#84 2007-09-26 3:37 pm
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
Another method is to check the CPU usage of each machine. If the CPU usage is low, then of course Folding is not running properly. Color me awkward on diagnosing 400 machines.
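A rough sketch of the CPU check, assuming the client binary is named fah5 as in the crontabs posted earlier. The function name sum_cpu is made up, and it only looks at the first word of the command column, so paths containing spaces would not match.

```shell
# Read "<%cpu> <command>" lines on stdin and print the total %CPU used by
# processes whose command basename equals the name given as $1.
sum_cpu() {
  awk -v name="$1" '
    {
      n = split($2, parts, "/")      # strip any leading path
      if (parts[n] == name) total += $1
    }
    END { printf "%.1f\n", total + 0 }
  '
}

# Hypothetical usage on one machine:
#   ps -A -o pcpu= -o comm= | sum_cpu fah5
```

A total near zero during the scheduled hours would suggest Folding is not running on that machine.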
Offline
#85 2007-09-27 1:41 pm
- otsep
- Member

- From: Virginia
- Registered: 2007-09-06
- Posts: 88
Re: Distributed(Xgrid) F@H
I looked through some logs and figured some stuff out. Most machines never peak above 80% CPU with 120% available. Going along these lines I'm going to have fah running on 1 processor full-time and only on the second during the previous schedule. What do you think?
Last edited by otsep (2007-09-28 12:04 pm)
Offline
#86 2007-09-27 3:46 pm
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
Sounds very good to me.
As for myself, the difference between my G4 iMac and the Intel iMac in point production is around 20x. Amazing, isn't it?
Offline
#87 2007-09-27 3:57 pm
Re: Distributed(Xgrid) F@H
It's worth a try. Maybe just do that to a couple machines for a day or two before changing all of them.
Offline
#88 2007-09-28 3:33 pm
- otsep
- Member

- From: Virginia
- Registered: 2007-09-06
- Posts: 88
Re: Distributed(Xgrid) F@H
Seemed to work out well. New crontab for multiprocessor machines.
Code:
0 17 * * 1,2,3,4,5 cd ~/Library/.fah1; nohup ./fah5 -local -advmethods -forceasm &> /dev/null &
5 7 * * 1,2,3,4,5 cd ~/Library/.fah2; nohup ./fah5 -local -advmethods -forceasm &> /dev/null &
0 7 * * 1,2,3,4,5 killall fah5
Offline
#89 2007-09-28 4:05 pm
Offline
#90 2007-09-28 4:12 pm
Re: Distributed(Xgrid) F@H
I'm not certain about this, but I think the machineid in the client.cfg needs to be a different number for the second processor. Can anyone verify this? I know when I created the AppleScript installer (Quick Fold) for PowerPCs, I wrote machineid=1 for the first processor and machineid=2 for the second processor. The FaHWiki mentions that this number can be 1-8.
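If the second client directory was copied from the first, something like the sketch below could change the id. It assumes client.cfg already contains a line of the form machineid=N; the function name set_machineid is hypothetical.

```shell
# Rewrite the machineid line in a client.cfg-style file.
# Assumes the file already has a line that starts with "machineid=".
set_machineid() {
  cfg=$1
  id=$2
  case $id in
    [1-8]) ;;                                  # allowed range per the FaHWiki
    *) echo "machineid must be 1-8" >&2; return 1 ;;
  esac
  if grep -q '^machineid=' "$cfg"; then
    sed "s/^machineid=.*/machineid=$id/" "$cfg" > "$cfg.new" &&
      mv "$cfg.new" "$cfg"
  else
    echo "no machineid line in $cfg; leaving it untouched" >&2
    return 1
  fi
}

# Hypothetical usage for the second processor's directory:
#   set_machineid ~/Library/.fah2/client.cfg 2
```

Stopping the client before editing its config is probably the safer order.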
Offline
#91 2007-09-28 4:54 pm
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
A quick reply at Folding Support Forums confirms Artemis' suspicion.
http://forum.folding-community.org/fpos … tml#200028
Offline
#92 2007-10-01 10:42 pm
Re: Distributed(Xgrid) F@H
I just want to throw some ideas out there for otsep for upgrading clients to beta v6.
Code:
crontab -l > oldcrontab
sed -e "s/local/local -smp/" -e "s/fah5/fah6/" oldcrontab > newcrontab
crontab -r
crontab newcrontab
killall fah5
cd ~/Library/FAH/
curl -O http://folding.stanford.edu/release/FAH6.00beta1-OSX-Intel-Console.tgz
tar xzf FAH6.00beta1-OSX-Intel-Console.tgz
nohup ./fah6 -local -smp -advmethods -forceasm &> /dev/null &
Check this carefully because you may have to tailor it to your own installs.
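One way to do that checking is to preview what the two sed substitutions would produce before touching the live crontab. This is only a sketch; the function name preview_upgrade is made up.

```shell
# Apply the same two substitutions to crontab text on stdin and print the
# result, without installing anything. Each substitution changes only the
# first match on a line.
preview_upgrade() {
  sed -e "s/local/local -smp/" -e "s/fah5/fah6/"
}

# Hypothetical usage: compare the current and rewritten entries side by side.
#   crontab -l
#   crontab -l | preview_upgrade
```

If the preview looks right for your directory names and flags, the rewritten text can then be saved to a file and installed with crontab as above.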
Offline
#93 2007-10-02 9:00 am
- otsep
- Member

- From: Virginia
- Registered: 2007-09-06
- Posts: 88
Re: Distributed(Xgrid) F@H
ArtemisG3 wrote:
Check this carefully because you may have to tailor it to your own installs.
Nod. I no longer curl from Stanford servers. With proxies and all that jazz in the mix, I decided to keep the files on an internal web server. Basically, my script changes are very similar, and it seems to be working. When I get the chance, I'll send my new single file install script that I'm using.
Offline
#94 2007-10-06 9:41 pm
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
Otsep has 434 processors within the past 7 days. Nice.
Offline
#95 2007-10-07 6:19 pm
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
Nefarious wrote:
Otsep has 434 processors within the past 7 days. Nice.
44,000 points per day average ! 
Offline
#96 2007-10-08 3:06 pm
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
OTSEP IN A STATISTICAL DEAD HEAT FOR 10TH PLACE FOR ALL USERS DAILY OUTPUT 7 DAY AVERAGE.
Hey, All Caps is pretty much a-okay in Folding@Home. 
Offline
#97 2007-10-08 7:53 pm
- otsep
- Member

- From: Virginia
- Registered: 2007-09-06
- Posts: 88
Re: Distributed(Xgrid) F@H
Nefarious wrote:
OTSEP IN A STATISTICAL DEAD HEAT FOR 10TH PLACE FOR ALL USERS DAILY OUTPUT 7 DAY AVERAGE.
Goals:
1. Be in the top ten donors for daily output.
2. Break into the top 500 in the next few weeks.
So I'm trying to get more machines into the mix. Hopefully, I should get a big bump in the next week or so.
Last edited by otsep (2007-10-08 7:58 pm)
Offline
#98 2007-10-08 11:54 pm
- Nicholas D. Wolfwood
- Member

- Registered: 2007-04-20
- Posts: 355
Re: Distributed(Xgrid) F@H
Nefarious wrote:
OTSEP IN A STATISTICAL DEAD HEAT FOR 10TH PLACE FOR ALL USERS DAILY OUTPUT 7 DAY AVERAGE.
Hey, All Caps is pretty much a-okay in Folding@Home.
IT MEANS SERIOUS BUSINESS
If you go to the bathroom during Phydeaux's 30,000th post, you'll be sorry.
Offline
#99 2007-10-10 10:54 pm
- Nefarious
- Tuning Fork
- Moderator

- From: 45°22"N 84°57"W
- Registered: 2002-09-30
- Posts: 7996
Re: Distributed(Xgrid) F@H
Otsep has a production dropoff.
(As did I)
Offline
#100 2007-10-11 8:37 pm
- otsep
- Member

- From: Virginia
- Registered: 2007-09-06
- Posts: 88
Re: Distributed(Xgrid) F@H
Nefarious wrote:
Otsep has a production dropoff.
(As did I)
Yep... and I have more ahead...
Offline

